Versa AI hub

Introducing the SQL Console on Datasets

March 3, 2025

Dataset usage is exploding, and Hugging Face has become the default home for many datasets. As the number of datasets uploaded to the Hub grows each month, so does the need to query, filter, and discover them.


Datasets created on the Hugging Face Hub each month

We are very excited to announce that you can now run SQL queries on your datasets directly on the Hugging Face Hub!

Introducing the SQL Console for Datasets

Every dataset now displays a new SQL Console badge. With just one click, you can open the SQL Console and query that dataset.

Querying the Magpie-Ultra dataset for excellent, high-quality reasoning instructions.

All the work is done in your browser, and the console comes with a few neat features:

  • 100% local: The SQL Console is powered by DuckDB WASM, so you can query a dataset without installing any dependencies.
  • Full DuckDB syntax: DuckDB supports full SQL syntax and offers many built-in functions for regex, lists, JSON, embeddings, and more. Its syntax is very similar to PostgreSQL.
  • Export results: You can export the results of a query to Parquet.
  • Shareable: You can share your query results for public datasets with a link.

How it works

Parquet conversion

Most datasets on Hugging Face are stored in Parquet, a columnar data format optimized for performance and storage efficiency. The Dataset Viewer on Hugging Face and the SQL Console load data directly from a dataset's Parquet files. If a dataset is stored in another format, the first 5GB is automatically converted to Parquet. You can find more information about the Parquet conversion process in the Dataset Viewer Parquet API documentation.

Using the Parquet files, the SQL Console creates views for you to query, based on the dataset's splits and configurations.
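As a rough illustration of that idea, the sketch below builds the kind of DuckDB `CREATE VIEW` statements that could expose each split of a dataset as a queryable view. The `build_view_statements` helper, the view naming scheme, and the example URLs are all hypothetical; this post does not describe the exact statements the SQL Console issues.

```python
def build_view_statements(parquet_files):
    """Return one DuckDB CREATE VIEW statement per split.

    `parquet_files` maps a split name to the list of Parquet URLs for that split.
    """
    statements = []
    for split, urls in sorted(parquet_files.items()):
        # Escape quotes so names and URLs stay valid SQL identifiers/literals.
        view_name = split.replace('"', '""')
        url_list = ", ".join("'" + url.replace("'", "''") + "'" for url in urls)
        statements.append(
            f'CREATE OR REPLACE VIEW "{view_name}" AS '
            f"SELECT * FROM read_parquet([{url_list}]);"
        )
    return statements


# Hypothetical layout: placeholder URLs, not real dataset files.
example_files = {
    "train": [
        "https://example.com/data/train/0000.parquet",
        "https://example.com/data/train/0001.parquet",
    ],
    "test": ["https://example.com/data/test/0000.parquet"],
}

for statement in build_view_statements(example_files):
    print(statement)
```

With views like these in place, a query such as `SELECT * FROM train` reads straight from the underlying Parquet files.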

DuckDB WASM 🦆

DuckDB WASM is the engine that powers the SQL Console. It is an in-process database engine that runs on WebAssembly in the browser. No server or backend is required.

Because it runs entirely in the browser, it gives users maximum flexibility to query data without installing any dependencies. It also makes it very easy to share reproducible results with a simple link.

You may be wondering, "Does it work for large datasets?" The answer is "Yes!"

Here is a query of the OpenCo7/upvoteweb dataset, which has 12.6M rows in its Parquet conversion.

Reddit movie suggestions

You can see that the results of a simple filter query came back in under 3 seconds.

Query time depends on the size of the dataset and the complexity of the query, but you may be surprised by how much you can do with the SQL Console.

As with any technology, there are limitations.

  • The SQL Console works for many queries. However, the memory limit is about 3GB, so a query can run out of memory and fail. (Tip: use filters, together with LIMIT, to reduce the amount of data you are querying.)
  • DuckDB WASM is very powerful, but it does not have full feature parity with DuckDB. For example, DuckDB WASM does not yet support the hf:// protocol for querying datasets.

Example: Converting a dataset from Alpaca to conversations

Now that we have introduced the SQL Console, let's look at a practical example. When fine-tuning large language models (LLMs), you often need to work with a variety of data formats. One particularly popular format is the conversational format, where each row represents a multi-turn dialog between the user and the model. The SQL Console can help you convert your data to this format efficiently. Let's see how to convert an Alpaca dataset into a conversational format using SQL.

Typically, developers handle this task in a Python preprocessing step, but we can show you how to accomplish the same thing in less than 30 seconds using the SQL Console.

In the dataset above, click the SQL Console badge to open the SQL Console. You should see the query below already filled in.

When you're ready, click the Run Query button to run the query.

SQL

WITH source_view AS (
  SELECT * FROM train
)
SELECT
  [
    struct_pack(
      "from" := 'user',
      "value" := CASE
        WHEN input IS NOT NULL AND input != ''
        -- chr(10) is a newline; DuckDB does not unescape a plain '\n' literal
        THEN instruction || chr(10) || chr(10) || input
        ELSE instruction
      END
    ),
    struct_pack("from" := 'assistant', "value" := output)
  ] AS conversation
FROM source_view
WHERE instruction IS NOT NULL
  AND output IS NOT NULL;

In the query, we use the struct_pack function to create a new STRUCT row for each conversation.

DuckDB has great documentation on the STRUCT data type and its functions. Many datasets also contain columns of JSON data, and DuckDB provides functions to easily parse and query these columns.
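For comparison, here is a minimal Python sketch of the same transformation as it might appear in a preprocessing step. It assumes each Alpaca row is a dict with `instruction`, `input`, and `output` keys; the `alpaca_to_conversation` helper and the sample rows are illustrative and not part of the original post.

```python
def alpaca_to_conversation(row):
    """Convert one Alpaca-style row into a two-turn conversation, or None if unusable."""
    instruction = row.get("instruction")
    output = row.get("output")
    # Mirror the WHERE clause: skip rows missing an instruction or an output.
    if instruction is None or output is None:
        return None
    extra = row.get("input")
    # Mirror the CASE expression: append the input only when it is non-empty.
    prompt = f"{instruction}\n\n{extra}" if extra else instruction
    return [
        {"from": "user", "value": prompt},
        {"from": "assistant", "value": output},
    ]


# Made-up sample rows, used only to exercise the helper.
rows = [
    {"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"},
    {"instruction": "Name a prime number.", "input": "", "output": "7"},
    {"instruction": "No answer here.", "input": "", "output": None},
]

conversations = [c for c in map(alpaca_to_conversation, rows) if c is not None]
for conversation in conversations:
    print(conversation)
```

The list of two dicts plays the role of the two struct_pack calls in the SQL query: one entry for the user turn and one for the assistant turn.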

Alpaca to conversation

Once you have the results, you can download them as a Parquet file. You can see what the final output looks like below:

Please give it a try!

As another example, you can try a SQL Console query on SkunkWorksAI/Reasoning-0.01 to see instructions with more than 10 reasoning steps.

SQL Snippets

DuckDB has many use cases that we are still exploring. We created a SQL Snippets Space to showcase what you can do with the SQL Console.

Here are some really interesting use cases we’ve found:

Remember, it takes just one click to download your SQL results as a Parquet file and use them in your dataset.

We would love to hear what you think of the SQL Console. If you have any feedback, please comment on this post!
