Versa AI hub
Hugging Face upload and download redesign

November 28, 2024

As part of the Hugging Face Xet team’s work to improve the Hugging Face Hub’s storage backend, we analyzed 24 hours of upload requests to better understand access patterns. On October 11, 2024, 8.2 million upload requests transferred 130.8 TB of data from 88 countries.

The map below visualizes this activity, with countries color-coded by bytes uploaded per hour.

Uploads are currently stored in an S3 bucket in us-east-1, optimized using S3 Transfer Acceleration. Downloads are cached and served through AWS CloudFront as a CDN. CloudFront’s 400+ edge locations provide global coverage and low-latency data transfer. However, like most CDNs, it is optimized for web content and imposes a 50 GB file size limit.

While this size limit is reasonable for typical Internet file transfers, it becomes a problem as file sizes in model and dataset repositories continue to grow. For example, meta-llama/Meta-Llama-3-70B has weights totaling 131 GB, split into 30 files to meet the Hub’s recommendation of chunking weights into 20 GB segments. Moreover, enabling advanced deduplication or compression on both uploads and downloads requires rethinking how file transfers are handled.

Custom protocols for upload and download

To push the Hugging Face infrastructure beyond its current limits, we are redesigning the Hub’s upload and download architecture. We plan to insert a content-addressed store (CAS) as the first stop for content delivery. This lets us implement custom protocols built on the basic tenets of dumb reads and smart writes. Unlike Git LFS, which treats files as opaque blobs, our approach analyzes files at the byte level, revealing opportunities to improve transfer speeds for the large files in model and dataset repositories.
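To make the idea of content addressing concrete, here is a minimal sketch of a CAS. It is purely illustrative and not Hugging Face’s actual implementation: the fixed chunk size and the use of SHA-256 are assumptions. Each chunk is stored under the hash of its own bytes, so identical content always lands at the same key and is stored only once.

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # assumed fixed chunk size, for illustration only


class ContentAddressedStore:
    """Toy CAS: chunks are stored under the SHA-256 hash of their bytes."""

    def __init__(self):
        self.chunks = {}  # hash -> chunk bytes

    def put_file(self, data: bytes) -> list[str]:
        """Split data into chunks, store each, and return the ordered
        list of chunk keys needed to reconstruct the file."""
        keys = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            key = hashlib.sha256(chunk).hexdigest()
            self.chunks[key] = chunk  # idempotent: same content, same key
            keys.append(key)
        return keys

    def get_file(self, keys: list[str]) -> bytes:
        """Reassemble a file from its ordered chunk keys (the 'dumb read')."""
        return b"".join(self.chunks[k] for k in keys)
```

Because repeated chunks map to the same key, a file full of duplicated regions occupies far less storage than its logical size.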

The read path prioritizes simplicity and speed, ensuring high throughput with minimal latency. Requests for files are routed to the CAS server, which returns reconstruction information. The data itself remains backed by the S3 bucket in us-east-1, and AWS CloudFront continues to serve as the CDN for downloads.

The write path is more involved, in order to optimize upload speed and provide additional security guarantees. As with reads, upload requests are routed to the CAS server, but instead of operating at the file level it operates on chunks. The CAS server tells the client (for example, huggingface_hub) which chunks are new, and only those chunks are transferred. Chunks are validated by CAS before being uploaded to S3.
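A “smart write” can be sketched as a short negotiation: the client hashes its chunks, the server reports which ones it is missing, and only those are uploaded. The names below are hypothetical and do not reflect the actual huggingface_hub or CAS protocol; this is just the shape of the exchange.

```python
import hashlib


def chunk_hashes(chunks: list[bytes]) -> list[str]:
    return [hashlib.sha256(c).hexdigest() for c in chunks]


class CasServer:
    """Server side: tracks known chunk hashes and accepts new chunks."""

    def __init__(self):
        self.store = {}  # hash -> chunk bytes

    def missing(self, hashes: list[str]) -> set[str]:
        """Tell the client which chunks it still needs to upload."""
        return {h for h in hashes if h not in self.store}

    def upload(self, chunk: bytes) -> None:
        """Store the chunk under the hash of its actual bytes, so a
        corrupted or mislabeled chunk cannot overwrite valid content."""
        self.store[hashlib.sha256(chunk).hexdigest()] = chunk


def smart_write(server: CasServer, chunks: list[bytes]) -> int:
    """Upload only the chunks the server does not already have.
    Returns the number of chunks actually transferred."""
    hashes = chunk_hashes(chunks)
    needed = server.missing(hashes)
    sent = 0
    for h, c in zip(hashes, chunks):
        if h in needed:
            server.upload(c)
            needed.discard(h)  # avoid re-sending duplicates within one file
            sent += 1
    return sent
```

Re-uploading an unchanged file transfers zero chunks, which is exactly the dedup win the protocol is after.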

There are many implementation details to work through, such as network constraints and storage overhead, which we will cover in a future post. For now, let’s look at the current state. The first diagram below shows the current read and write paths.

Old read and write sequence diagram
Reads are shown on the left, writes on the right. Note that writes go directly to S3 without any intermediary.

In the new design, by contrast, reads follow this path:

New read path for the proposed architecture
The new read path adds a content-addressed store (CAS) that provides reconstruction information. CloudFront continues to function as the CDN.

Finally, the updated write path is:

New write path for the proposed architecture
The new write path, in which CAS speeds up and validates uploads. S3 continues to provide backing storage.

By managing files at the byte level, we can apply optimizations tailored to different file formats. For example, we are improving deduplication for Parquet files and investigating compression for tensor files (such as Safetensors), which could improve upload speeds by 10-25%. As new formats emerge, we are uniquely positioned to develop further enhancements that improve the development experience on the Hub.
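To see why compression is attractive for tensor files, the sketch below compresses a toy float32 buffer with zlib. This is purely illustrative: zlib and the repeated-value data are assumptions, real weight files compress far less than this artificial example, and the 10-25% figure above refers to upload-speed improvements, not to this toy ratio.

```python
import struct
import zlib

# A toy "tensor": 10,000 float32 values drawn from a small set of repeated
# values, mimicking the redundancy sometimes present in real weight files.
values = [0.0, 0.5, -0.5, 1.0] * 2500
raw = struct.pack(f"{len(values)}f", *values)

compressed = zlib.compress(raw, level=6)
ratio = len(compressed) / len(raw)
print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes "
      f"({ratio:.1%} of original)")
```

Because uploads are analyzed at the byte level, the protocol can decide per format whether compressing a chunk is worth the CPU cost before it goes over the wire.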

This protocol also provides significant improvements for enterprise customers and power users. Inserting a control plane for file transfers ensures that no malicious or invalid data is uploaded. Operationally, uploads are no longer a black box. Enhanced telemetry provides an audit trail and detailed logging to help hub infrastructure teams identify and resolve issues quickly and efficiently.

Designed for global access

To support this custom protocol, we had to determine the optimal geographic distribution of CAS services. AWS Lambda@Edge was initially considered for its broad global coverage and minimal round-trip times, but because it relies on CloudFront triggers, it is not compatible with the updated upload path. Instead, we decided to deploy CAS nodes in a select few of AWS’s 34 Regions.

A closer look at the 24-hour window of S3 PUT requests revealed global traffic patterns in uploads to the Hub. As expected, the majority of activity comes from North America and Europe, with consistently high upload volume throughout the day. The data also shows a strong and growing presence in Asia. By focusing on these core regions, we can place CAS points of presence that balance storage and network resources while minimizing latency.

Upload Pareto chart

AWS offers 34 Regions, and our goal is to keep infrastructure costs reasonable while maintaining a good user experience. Of the 88 countries in this snapshot, the Pareto chart above shows that the top 7 countries account for 80% of bytes uploaded, and the top 20 countries account for 95% of all uploads and requests.
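A Pareto cutoff like this can be computed directly from per-country byte counts. The sketch below uses made-up numbers, not the actual measurements behind the chart:

```python
def countries_for_share(bytes_by_country: dict[str, int],
                        share: float) -> list[str]:
    """Smallest set of top countries whose uploads cover `share`
    of the total bytes, greedily from largest to smallest."""
    total = sum(bytes_by_country.values())
    ranked = sorted(bytes_by_country.items(),
                    key=lambda kv: kv[1], reverse=True)
    covered, picked = 0, []
    for country, nbytes in ranked:
        picked.append(country)
        covered += nbytes
        if covered >= share * total:
            break
    return picked
```

Running this over the real per-country totals is how one would arrive at figures like “top 7 cover 80%.”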

The US emerged as the largest source of upload traffic, necessitating a PoP in that region. In Europe, most activity is concentrated in western countries (such as Luxembourg, the United Kingdom, and Germany), with some additional activity in Africa (particularly Algeria, Egypt, and South Africa). Upload traffic in Asia is driven primarily by Singapore, Hong Kong, Japan, and South Korea.

Using simple heuristics to distribute traffic, we can divide CAS coverage into three main areas:

  • us-east-1: serving the Americas
  • eu-west-3: serving Europe, the Middle East, and Africa
  • ap-southeast-1: serving Asia and Oceania
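The heuristic amounts to a static continent-to-PoP table. A sketch follows; the bucket codes are illustrative, and a production system would resolve clients via a GeoIP database rather than a hand-written dict:

```python
# Illustrative continent buckets; real routing would use a GeoIP lookup.
REGION_FOR_CONTINENT = {
    "NA": "us-east-1",       # North America
    "SA": "us-east-1",       # South America
    "EU": "eu-west-3",       # Europe
    "ME": "eu-west-3",       # Middle East
    "AF": "eu-west-3",       # Africa
    "AS": "ap-southeast-1",  # Asia
    "OC": "ap-southeast-1",  # Oceania
}


def cas_pop_for(continent_code: str) -> str:
    """Route a client to its CAS point of presence; fall back to us-east-1
    for unknown locations."""
    return REGION_FOR_CONTINENT.get(continent_code, "us-east-1")
```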

This split turns out to be quite effective: the US and Europe accounted for 78.4% of bytes uploaded, with Asia accounting for the remaining 21.6%.

New AWS mapping

This regional breakdown balances load across the three CAS PoPs and leaves room to expand in ap-southeast-1 and, if needed, to scale up in us-east-1 and eu-west-3.

Based on expected traffic, we plan to allocate resources as follows:

  • us-east-1: 4 nodes
  • eu-west-3: 4 nodes
  • ap-southeast-1: 2 nodes

Verification and inspection

Increasing the first-hop distance for some users has a limited impact on the hub’s overall bandwidth. We estimate that cumulative bandwidth across all uploads will decrease from 48.5 Mbps to 42.5 Mbps (a roughly 12% reduction), but we expect the performance impact to be more than offset by other system optimizations.
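As a quick sanity check on the quoted figures:

```python
before_mbps = 48.5  # estimated cumulative upload bandwidth today
after_mbps = 42.5   # estimate after routing some users to farther PoPs

reduction = (before_mbps - after_mbps) / before_mbps
print(f"{reduction:.1%} reduction")  # about 12%, matching the estimate above
```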

We are currently working toward moving this infrastructure into production by the end of 2024, starting with a single CAS in us-east-1. From there, we will begin replicating internal repositories to the new storage system to benchmark transfer performance, then replicate the CAS to the additional PoPs mentioned above for further benchmarking. Based on these results, we will keep refining our approach so that everything works smoothly when the storage backend is fully rolled out next year.

Beyond infrastructure

As this analysis continues, new opportunities for deeper insight are emerging. Hugging Face hosts one of the largest collections of data from the open-source machine learning community, giving us a unique vantage point on the techniques and trends driving AI development around the world.

For example, future analyses could categorize models uploaded to the Hub by use case (NLP, computer vision, robotics, large language models, and so on) and explore geographic trends in ML activity. This data not only informs infrastructure decisions but also offers a lens into the evolving landscape of machine learning.

We invite you to explore our current findings: visit our interactive space to see upload distributions by region, and follow the team to hear more about what we’re building.
