Versa AI hub
Enabling communities to collectively build better datasets with Argilla and Hugging Face Spaces

By versatileai | July 2, 2025 | 4 min read
Daniel Villa's Avatar

Recently, Argilla and Hugging Face launched Data is Better Together, an experiment in collectively building a preference dataset of prompt rankings. Within a few days, we had over 11,000 prompt ratings from 350 community contributors labeling data.

Check out our ongoing dashboard for the latest statistics!

This resulted in the release of 10K_PROMPTS_RANKED, a dataset of 10,000 prompts with user ratings of prompt quality. We want to enable more projects like this!
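To illustrate how ratings from several contributors can be combined into a ranked dataset, here is a minimal sketch. The records, annotator names, field names (`avg_rating`, `num_ratings`), and the 1-5 scale are all assumptions made for illustration; this is not the schema of the released dataset or the aggregation method used for it.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical annotation records: (prompt, annotator, rating on an assumed 1-5 scale).
ratings = [
    ("Explain recursion to a child.", "ana", 5),
    ("Explain recursion to a child.", "ben", 4),
    ("write code", "ana", 2),
    ("write code", "cy", 1),
    ("write code", "ben", 3),
]


def aggregate_ratings(records):
    """Group ratings by prompt and return rows sorted by mean rating, best first."""
    by_prompt = defaultdict(list)
    for prompt, _annotator, rating in records:
        by_prompt[prompt].append(rating)
    rows = [
        {"prompt": prompt, "avg_rating": mean(values), "num_ratings": len(values)}
        for prompt, values in by_prompt.items()
    ]
    return sorted(rows, key=lambda row: row["avg_rating"], reverse=True)


ranked = aggregate_ratings(ratings)
for row in ranked:
    print(f'{row["avg_rating"]:.2f}  ({row["num_ratings"]} ratings)  {row["prompt"]}')
```

Keeping the number of ratings next to the average lets downstream users filter out prompts that only one or two people have rated.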

In this post, we explain why it is essential for communities to collaborate on building datasets, and we share an invitation to join the first cohort of communities building datasets together.

Data remains essential for better models

Data continues to be essential for better models. Published research, open-source experiments, and ongoing evidence from the open-source community all confirm that better data can lead to better models.

Figure: Screenshot of a Hugging Face Hub dataset (the question).

Figure: Screenshot of a Hugging Face Hub dataset (the frequent answer).

Why build datasets together?

Data is essential for machine learning, but many languages, domains, and tasks still lack high-quality datasets for training, evaluation, and benchmarking. The community shares thousands of models, datasets, and demos every day via the Hugging Face Hub. This collaboration has allowed the open-access AI community to achieve remarkable things. Enabling communities to build datasets collectively unlocks unique opportunities for creating the next generation of datasets and, with them, the next generation of models.

By enabling communities to collectively build and improve datasets, people can:

  • Contribute to the development of open-source ML without needing ML or programming skills.
  • Create chat datasets for a specific language.
  • Develop benchmark datasets for a specific domain.
  • Create preference datasets from a diverse range of participants.
  • Build datasets for a specific task.
  • Build entirely new types of datasets together as a community.

Importantly, we believe that building datasets together will let the community build better datasets, and will let people who don't know how to code contribute to AI development.

Make it easier for people to contribute

One of the challenges in many previous efforts to build AI datasets collectively was setting up an efficient annotation task. Argilla is an open-source tool that helps you create datasets for LLMs and for smaller, specialized, task-specific models. Hugging Face Spaces is a platform for building and hosting machine learning demos and applications. Recently, Argilla added support for authentication via Hugging Face accounts for Argilla instances hosted on Spaces. This means it now takes a user only a few seconds to start contributing to an annotation task.

Now that we have stress-tested this new workflow while creating the 10K_PROMPTS_RANKED dataset, we want to support the community in launching new collective dataset efforts.

Join the first cohort of communities who want to build better datasets together!

We're very excited about the possibilities unlocked by this new, simple flow for hosting annotation tasks. To support the community in building better datasets, Hugging Face and Argilla invite interested people and communities to join the first cohort of community dataset builders.

People joining this cohort will:

  • Be supported in creating an Argilla Space with Hugging Face authentication. Hugging Face will grant participants free persistent storage and improved CPU Spaces.
  • Have their comms and promotion of the initiative amplified by Argilla and Hugging Face.
  • Be invited to join a cohort community channel.

Our goal is to support the community in building better datasets together. We are open to many ideas and want to help as much as we can.

What kind of projects are you looking for?

We are open to supporting many types of projects, especially those from existing open-source communities. We are particularly interested in projects that focus on building datasets for languages, domains, and tasks that are currently underrepresented in the open-source community. Our only current limitation is that we are focusing primarily on text-based datasets. If you have a very cool idea for a multimodal dataset, we would love to hear from you, but we may not be able to support you in this first cohort.

Tasks can be fully open or open only to members of a particular Hugging Face Hub organization.
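As a rough illustration of the difference between a fully open task and an organization-restricted one, the rule can be sketched as below. The `AnnotationTask` class, the `can_contribute` helper, and the task and organization names are hypothetical; this is not how Argilla or Hugging Face implement access control.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class AnnotationTask:
    """Hypothetical description of an annotation task and who may join it."""
    name: str
    restricted_to_org: Optional[str] = None  # None means the task is fully open


def can_contribute(task: AnnotationTask, user_orgs: Set[str]) -> bool:
    """Return True if a user belonging to `user_orgs` may annotate `task`."""
    if task.restricted_to_org is None:
        return True  # fully open: any signed-in user can contribute
    return task.restricted_to_org in user_orgs


open_task = AnnotationTask(name="prompt-ranking")
org_task = AnnotationTask(name="domain-benchmark", restricted_to_org="example-org")

print(can_contribute(open_task, set()))
print(can_contribute(org_task, {"example-org"}))
print(can_contribute(org_task, {"another-org"}))
```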

If you would like to take part in the first cohort, please join the #Data-IS-Better-Together channel on the Hugging Face Discord and let us know what you would like to build!

We look forward to building better datasets with you!
