Versa AI hub
Introducing the Open Leaderboard for Hebrew LLMs!

By versatileai | May 19, 2025 | 4 Mins Read
Omer Koren's avatar




This project addresses the critical need for advancement in Hebrew NLP. Because Hebrew is considered a low-resource language, existing LLM leaderboards often lack benchmarks that accurately reflect its unique characteristics. Today we are excited to introduce a pioneering effort to change this narrative: a new open LLM leaderboard designed specifically to evaluate and enhance Hebrew language models.

Hebrew is a morphologically rich language with a complex system of roots and patterns. Words are built from roots, with prefixes, suffixes, and infixes used to modify meaning, tense, or plural forms (among other features). This complexity means that many valid word forms can be derived from a single root, which makes traditional tokenization strategies, designed for morphologically simpler languages, ineffective. As a result, existing language models can struggle to process and understand the nuances of Hebrew accurately, highlighting the need for benchmarks that cater to these unique linguistic properties.
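As a small illustration of the tokenization problem (a sketch for this article, not taken from the leaderboard code), the snippet below shows how prefix particles fuse with the noun בית ("house"). The word list is hand-picked and the English glosses are approximate. Splitting on whitespace yields five distinct Hebrew tokens, while the English glosses all reuse the single token "house".

```python
# Illustrative only: a hand-picked word list, not leaderboard data.
# English glosses are approximate (unvocalized Hebrew can be ambiguous).
hebrew_forms = {
    "בית": "house",
    "הבית": "the house",
    "בבית": "in the house",
    "לבית": "to the house",
    "ובבית": "and in the house",
}


def whitespace_vocab(texts):
    """Collect the distinct tokens produced by splitting on whitespace."""
    return {token for text in texts for token in text.split()}


hebrew_vocab = whitespace_vocab(hebrew_forms.keys())
english_vocab = whitespace_vocab(hebrew_forms.values())

# Every Hebrew surface form is its own token, yet all of them end in the same noun.
hebrew_tokens_with_stem = [t for t in hebrew_vocab if t.endswith("בית")]

print(len(hebrew_vocab), len(hebrew_tokens_with_stem))
print("house" in english_vocab)
```

A tokenizer that does not separate the particles from the noun has to learn each fused form independently, which is one reason Hebrew-specific evaluation is useful.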

LLM research in Hebrew therefore requires a dedicated benchmark that addresses the nuances and linguistic characteristics of the language. Our leaderboard is set to fill this gap by providing robust evaluation metrics on language-specific tasks and by promoting open, community-driven improvement of Hebrew generative language models. We believe the initiative will become a platform for researchers and developers to share, compare, and improve Hebrew LLMs.

Leaderboard Metrics and Tasks

We developed four key datasets, each designed to test a language model's understanding and generation of Hebrew, regardless of its performance in other languages. These benchmarks use a few-shot prompt format to evaluate the models, ensuring that they can adapt and respond correctly even with limited context.
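A few-shot prompt places a handful of solved examples before the unsolved query. The sketch below shows one way such a prompt could be assembled. The function name `build_few_shot_prompt` and the "Q:/A:" template are hypothetical choices for illustration; the leaderboard's actual prompt construction is documented on the leaderboard itself.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    examples: List[Tuple[str, str]],
    query: str,
    instruction: str = "",
) -> str:
    """Concatenate solved examples, then the unsolved query, into one prompt.

    The "Q:/A:" layout is an assumption made for this sketch.
    """
    parts = [instruction] if instruction else []
    for question, answer in examples:
        parts.append(f"Q: {question}\nA: {answer}")
    # The final block ends with an open "A:" so the model supplies the answer.
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)


prompt = build_few_shot_prompt(
    examples=[("2 + 2 = ?", "4"), ("3 + 5 = ?", "8")],
    query="1 + 6 = ?",
)
print(prompt)
```

Placeholder arithmetic questions are used here only to keep the example short; a Hebrew benchmark would fill the same slots with Hebrew examples.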

Below is a summary of each benchmark included in the leaderboard. Visit the Leaderboard tab for a more comprehensive breakdown of each dataset, its scoring system, and its prompt construction.

Hebrew Question Answering: This task evaluates the model's ability to understand and process information presented in Hebrew, focusing on comprehension and on accurately retrieving answers based on context. It checks the model's grasp of Hebrew syntax and semantics through a direct question-and-answer format.

Source: Test subset of HEQ dataset.

Sentiment Accuracy: This benchmark tests the model's ability to detect and interpret sentiment in Hebrew text. It assesses how accurately the model classifies statements as positive, negative, or neutral based on linguistic cues.

Winograd Schema Challenge: This task is designed to measure the model's understanding of pronoun resolution and contextual ambiguity in Hebrew. It tests the model's ability to resolve pronouns correctly in complex sentences using logical reasoning and general world knowledge.

Translation: This task evaluates the model's proficiency in translating between English and Hebrew. It assesses linguistic accuracy, fluency, and the ability to preserve meaning across languages, highlighting the model's capability in bilingual translation tasks.
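For the sentiment benchmark described above, a natural score is the share of statements labeled correctly. The text here does not spell out the exact scoring, so the sketch below assumes plain accuracy over the three labels; the function name and the sample labels are made up for illustration.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that match the reference label.

    Predictions are stripped and lower-cased first, since generated text
    often carries stray whitespace or capitalization.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(
        pred.strip().lower() == ref for pred, ref in zip(predictions, references)
    )
    return correct / len(references)


# Hypothetical model outputs and gold labels.
preds = ["positive", "Neutral ", "negative", "positive"]
golds = ["positive", "neutral", "positive", "positive"]
score = accuracy(preds, golds)
print(score)
```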

Technical Setup

The leaderboard is inspired by the Open LLM Leaderboard and uses the Demo Leaderboard template. Submitted models are automatically deployed using Hugging Face's Inference Endpoints and evaluated through API requests managed by the lighteval library. The implementation was straightforward: the main task was setting up the environment, and the rest of the code ran smoothly.
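To give a rough idea of what an API request to a deployed model looks like, the sketch below builds (but does not send) a text-generation request using only the Python standard library. The endpoint URL and token are placeholders, and the JSON shape (`inputs` plus `parameters`) is an assumption based on the format commonly accepted by text-generation endpoints. In the leaderboard itself these requests are handled by the evaluation library, not by hand-written code like this.

```python
import json
import urllib.request

# Hypothetical values: replace with a real endpoint URL and access token.
ENDPOINT_URL = "https://example.invalid/hebrew-model"
API_TOKEN = "hf_xxx"


def build_generation_request(
    prompt: str, max_new_tokens: int = 64
) -> urllib.request.Request:
    """Build a POST request carrying a text-generation payload (assumed shape)."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        ENDPOINT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_generation_request("שלום")

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))

body = json.loads(request.data.decode("utf-8"))
print(request.get_method(), body["parameters"])
```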

Engage with Us

We invite researchers, developers, and enthusiasts to participate in this initiative. Whether you are interested in submitting a model for evaluation or in joining the discussion about improving Hebrew language technologies, your contributions are important. For guidelines on how to submit models for evaluation, visit the Submit page on the leaderboard, or join the discussion page of the leaderboard's HF Space.

This new leaderboard is more than just a benchmarking tool. We hope it will encourage the Israeli tech community to recognize and address the gaps in language technology research for Hebrew. By providing detailed, specific evaluations, it aims to catalyze the development of models that are not only linguistically diverse but also culturally accurate, paving the way for innovation that respects the richness of Hebrew. Join us on this exciting journey to reshape the landscape of language modeling!

Sponsorship

The leaderboard is sponsored by DDR&D IMOD / The Israeli National Program for NLP in Hebrew and Arabic, together with DICTA: The Israel Center for Text Analysis and Webiks. We would like to extend our gratitude to Professor Reut Tsarfaty of Bar-Ilan University for her scientific consultation and guidance.

© 2025 Versa AI Hub. All Rights Reserved.
