Migrating the Hub from Git LFS to Xet

By versatileai | July 16, 2025

In January of this year, the Xet team at Hugging Face deployed a new storage backend, serving roughly 6% of Hub downloads from it. That was an important milestone, but only the beginning. In the six months since, some 500,000 repositories holding more than 20 PB of data have migrated to Xet, moving the Hub beyond Git LFS and onto a storage system that scales with the workflows of AI builders.

Today, more than 1 million Hub users are on Xet. In May, it became the default for new users and organizations on the Hub. With only a few dozen GitHub issues, forum threads, and Discord messages along the way, this may be the quietest migration of this magnitude.

How? Partly because the team came prepared, with years of experience building and supporting the content-addressed store (CAS) and Rust client that provide Xet's foundation. Without those pieces, Git LFS might still be the future of the Hub. The unsung heroes of this transition, however, are:

  • The Git LFS bridge
  • A piece of infrastructure known internally as the background content migration

Together, these components have allowed us to actively migrate petabytes a day without worrying about the impact on the Hub or the community. They also give us the peace of mind to move even faster in the coming weeks and months (skip to the end to see what’s coming).

Bridge and backward compatibility

In the early days of planning the transition to Xet, we made a few important design decisions:

  • There would be no “hard cutover” from Git LFS to Xet
  • A Xet-enabled repository can contain both LFS and Xet content
  • Migrations can run in the background without disrupting downloads or uploads

These seemingly straightforward decisions, driven by our commitment to the community, were critical. Above all, they meant that users and teams would not need to change their workflows immediately or download a new client just to interact with a Xet-enabled repository.

If you have a Xet-aware client (hf-xet, the Xet integration for huggingface_hub), uploads and downloads go through the full Xet stack. The client either splits files into chunks using content-defined chunking during upload, or requests file reconstruction information during download. On upload, the chunks are passed to the CAS and stored in S3. On download, the CAS provides the chunk ranges the client needs to request from S3 so it can rebuild the file locally.
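
For most users this is invisible: the same huggingface_hub calls work whether a repository is backed by LFS or Xet. As a minimal sketch (the repository and file names are placeholders, and it assumes you are already logged in), here is an upload and a download that would route through the Xet stack when hf-xet is installed:

from huggingface_hub import HfApi, hf_hub_download

api = HfApi()

# Upload: with hf-xet installed, the client splits the file with
# content-defined chunking and sends the chunks to the CAS (backed by S3).
api.upload_file(
    path_or_fileobj="model.safetensors",
    path_in_repo="model.safetensors",
    repo_id="your-org/your-model",  # placeholder repository
    repo_type="model",
)

# Download: the client requests reconstruction information, fetches the
# needed chunk ranges from S3, and rebuilds the file locally.
local_path = hf_hub_download(
    repo_id="your-org/your-model",  # placeholder repository
    filename="model.safetensors",
)
print(local_path)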

Older versions of huggingface_hub or huggingface.js that do not support chunk-based file transfer can still download from and upload to Xet repos, but those bytes take a different route. When a Xet-backed file is requested from the Hub along the resolve endpoint, the Git LFS bridge mimics the LFS protocol, constructing and returning a single presigned URL. The bridge then does the work of rebuilding the file from the content held in S3 and returns it to the requester.

A very simplified view of the Git LFS bridge; in reality, this path includes several more API calls and components, such as a CDN in front of the bridge, DynamoDB for file metadata, and S3 itself.

To see this in action, right-click on the image above and open it in a new tab. The request to https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/migrating-hub-to-xet/bridge.png will be redirected; check your browser's developer tools to see the URL it redirects to.
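
If you would rather inspect this from a script than from the browser's developer tools, a small sketch using the requests library surfaces the redirect the bridge issues for that same URL:

import requests

url = ("https://huggingface.co/datasets/huggingface/documentation-images"
       "/resolve/main/blog/migrating-hub-to-xet/bridge.png")

# Ask the Hub to resolve the file but don't follow the redirect, so the
# presigned URL constructed by the bridge is visible in the response headers.
resp = requests.get(url, allow_redirects=False)
print(resp.status_code)              # typically a 3xx redirect
print(resp.headers.get("Location"))  # the presigned URL the bridge returns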

Meanwhile, when a non-Xet-enabled client uploads a file, it is first sent to LFS storage and then migrated to Xet. This "background content migration" process, mentioned only briefly in the documentation, powers both the migration to Xet and backward-compatible uploads. It is behind the migration of many petabytes of models and datasets, keeping 500,000 repos in sync with Xet storage without missing a beat.

Whenever a file needs to be migrated from LFS to Xet, a webhook fires and the event is pushed onto a distributed queue, where it is processed by an orchestrator. The orchestrator:

  • Enables Xet on the repository, if the event requests it
  • Gets the list of LFS files across all revisions of the repo
  • Batches the files into jobs based on size or file count (up to 1,000 files or 500 MB per job)
  • Places the jobs onto another queue, consumed by the migration worker pods

The migration workers pick up these jobs, and each pod:

  • Downloads the LFS files listed in the batch
  • Uploads them to the Xet content store using xet-core

Migration flow, triggered by a webhook event; the flow starts from the orchestrator to keep things simple.
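
The batching step is the easiest part to picture in code. Here is a rough, hypothetical sketch of how an orchestrator might group LFS files into jobs under the limits described above (the names, thresholds, and data shapes are illustrative, not the actual implementation):

MAX_FILES_PER_JOB = 1000           # batch caps described above (illustrative)
MAX_BYTES_PER_JOB = 500 * 1024**2  # ~500 MB

def build_jobs(lfs_files):
    """Group LFS files (dicts with 'path' and 'size') into migration jobs."""
    jobs, job, job_bytes = [], [], 0
    for f in lfs_files:
        # Start a new job once adding this file would exceed either cap.
        if job and (len(job) >= MAX_FILES_PER_JOB or job_bytes + f["size"] > MAX_BYTES_PER_JOB):
            jobs.append(job)
            job, job_bytes = [], 0
        job.append(f)
        job_bytes += f["size"]
    if job:
        jobs.append(job)
    return jobs

# Each job would then be pushed onto a queue consumed by the migration worker
# pods, which download the LFS files and upload them to Xet storage via xet-core.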

Scaling the migration

In April, we tested the limits of this system by reaching out to Bartowski and asking if he wanted to try Xet. At close to 500 TB across roughly 2,000 repositories, the Bartowski migration exposed some weak links:

  • During global dedupe, temporary shard files were first written to /tmp and then moved to the shard cache. On the worker pods, however, /tmp and the Xet cache sat on different mount points, so the move failed and the shard files were never cleaned up. The disks eventually filled, triggering a wave of "no space left on device" errors (a small sketch of this failure mode follows the list).
  • After supporting the Llama 4 launch, we had scaled the CAS for bursty downloads, but the migration workers flipped the script: hundreds of multi-gigabyte uploads pushed the CAS past its provisioned resources.
  • Profiling the worker pods revealed network and EBS I/O bottlenecks.
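
The first of these is a classic cross-device problem: a plain rename cannot move a file between mount points. The real fix landed in xet-core, but the failure mode and the standard workaround are easy to illustrate in Python (illustrative only, not the xet-core code):

import errno
import os
import shutil

def move_shard(src: str, dst: str) -> None:
    """Move a finished shard from the temp dir to the shard cache.

    os.rename() only works within a single filesystem; when /tmp and the
    shard cache sit on different mount points it fails with EXDEV, which is
    what left orphaned shards filling the workers' disks. Falling back to
    shutil.move() (a copy followed by a delete) handles the cross-device case.
    """
    try:
        os.rename(src, dst)
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise
        shutil.move(src, dst)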

Correcting these three monsters meant touching every layer: patching xet-core, resizing the CAS, and bumping the worker node specs. Luckily, Bartowski was game to work with us as all of his repositories headed to Xet. The same lessons have since driven the migrations of the biggest storage users on the Hub, such as RichardErkhov (1.7 PB across 25,000 repositories) and mradermacher (6.1 PB across 42,000 repositories).

Meanwhile, CAS throughput grew by nearly an order of magnitude between the first large-scale migration and the most recent ones:

  • Bartowski migration: the CAS sustained ~35 Gb/s on top of the ~5 Gb/s coming from normal Hub traffic.
  • mradermacher and RichardErkhov migrations: the CAS peaked at around 300 Gb/s while still carrying a daily load of ~40 Gb/s.

CAS throughput: each spike corresponds to a major migration, with baseline throughput steadily climbing to ~100 Gb/s as of July 2025.

Zero friction, faster transfers

When we started swapping out LFS, we had two goals in mind:

  • Do no harm

Designing with our original constraints and these goals in mind meant we could:

  • Build and ship hf-xet to handle uploads and downloads from Xet-enabled repositories before including it in huggingface_hub by default
  • Learn how to scale the infrastructure and how the client behaves everywhere the community works, from laptops to distributed file systems, adding support for crucial dependencies along the way
  • Migrate the Hub from LFS to Xet while the infrastructure handles the rest

Instead of waiting for every upload path to become Xet-aware, forcing a hard cutover, or pushing the community to adopt a specific workflow, we could begin migrating the Hub to Xet immediately with minimal user impact. In short, teams are moving to Xet organically, with infrastructure that preserves their workflows and supports the long-term goal of a unified storage system.

Xet for everyone

In January and February, we onboarded power users to provide feedback and pressure-test the infrastructure. To gather broader community feedback, we launched a waitlist for previewing Xet-enabled repositories. Soon after, Xet became the default for new users on the Hub.

Today, we support some of the biggest creators on the Hub (Meta Llama, Google, OpenAI, Qwen) while the community continues to work uninterrupted.

What’s next?

Starting this month, we are bringing Xet to everyone. Keep an eye out for an email granting you access to Xet; once you have it, update to the latest huggingface_hub (pip install -U huggingface_hub) to immediately unlock faster transfers (a quick way to verify the client is shown after the list below). This also means:

  • All existing repositories will migrate from LFS to Xet
  • All newly created repositories will be Xet-enabled by default
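
Once the upgrade lands, nothing else in your code needs to change. If you want to confirm the Xet-aware client is present, a quick check (assuming the companion package is importable as hf_xet) looks like this:

import importlib.util
import huggingface_hub

print("huggingface_hub version:", huggingface_hub.__version__)
# hf-xet is the Rust-backed transfer client; when it is importable, uploads and
# downloads to Xet-enabled repos use chunk-based transfers instead of the bridge.
print("hf_xet installed:", importlib.util.find_spec("hf_xet") is not None)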

If you use the browser to upload to or download from the Hub, or you use Git, that is fine too; chunk-based support for both is coming soon. In the meantime, keep using the workflows you already have. There are no limitations.

After that, we will open source the entire Xet protocol and infrastructure stack. The future of storing and moving bytes at the scale of AI workloads is on the Hub, and we aim to bring it to everyone.

If you have any questions, drop a line in the comments or open a discussion on the Xet team page.
