The Hugging Face Hub is a vast repository, currently hosting over 750K public models, offering a diverse range of pre-trained models for various machine learning frameworks. Of these, 346,268 (at the time of writing) models are built with the popular transformers library. The KerasHub library recently added an integration with the Hub, compatible with a first batch of 33 models.
In this first version, KerasHub users were limited to the KerasHub-based models available on the Hugging Face Hub.
from keras_hub.models import GemmaCausalLM

gemma_lm = GemmaCausalLM.from_preset(
    "hf://google/gemma-2b-keras"
)
They were able to train or fine-tune the model and upload it back to the Hub (note that the model is still a Keras model).
model.save_to_preset("./gemma-2b-finetune")
keras_hub.upload_preset(
    "hf://username/gemma-2b-finetune",
    "./gemma-2b-finetune"
)
They were, however, missing out on the extensive collection of over 300K models created with the transformers library. Figure 1 shows the 4K Gemma models on the Hub.
But what if you could access and use these 300K+ models with KerasHub, significantly expanding your model selection and capabilities?
from keras_hub.models import GemmaCausalLM

gemma_lm = GemmaCausalLM.from_preset(
    "hf://google/gemma-2b"  # a transformers checkpoint, not a Keras one
)
We are thrilled to announce a significant step forward for the Hub community: Transformers and KerasHub now share a model save format. This means that models of the transformers library on the Hugging Face Hub can also be loaded directly into KerasHub. Initially, the integration focuses on enabling the use of Gemma (1 and 2), Llama 3, and PaliGemma models, with plans to expand compatibility to a wider range of architectures in the near future.
Use a wider range of frameworks
Because KerasHub models can seamlessly use TensorFlow, JAX, or PyTorch backends, a single line of code can now load this huge range of model checkpoints into any of these frameworks. Found a great checkpoint on Hugging Face but wish you could deploy it to TFLite for serving, or port it to JAX for research? Now you can!
How to use it
Using the integration requires updating your Keras installation:
$ pip install -U -q keras-hub
$ pip install -U "keras>=3.3.3"
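If you want to verify the `keras>=3.3.3` requirement programmatically before loading a model, a minimal sketch using only the standard library could look like this (the `parse_version` and `meets_minimum` helpers are hypothetical conveniences, not part of Keras or KerasHub):

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version string like '3.3.3' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))

def meets_minimum(installed: str, required: str = "3.3.3") -> bool:
    """Check that the installed Keras version satisfies the integration's minimum."""
    return parse_version(installed) >= parse_version(required)

# Example: Keras 3.4.0 is new enough, 3.2.0 is not.
print(meets_minimum("3.4.0"))  # True
print(meets_minimum("3.2.0"))  # False
```

In practice you would pass `keras.__version__` as the `installed` argument.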
Once updated, it’s easy to try out the integration as follows:
from keras_hub.models import Llama3CausalLM

causal_lm = Llama3CausalLM.from_preset(
    "hf://NousResearch/Hermes-2-Pro-Llama-3-8B"
)
causal_lm.summary()
Under the hood: how it works
Transformers models are stored as a set of configuration files in JSON format, a tokenizer (usually a .json file), and a set of safetensors weight files. The actual modeling code is contained in the transformers library itself. This means that cross-loading a transformers checkpoint into KerasHub is relatively straightforward, as long as both libraries have modeling code for the relevant architecture. All that is needed is to map configuration variables, weight names, and tokenizer vocabularies from one format to the other, producing a KerasHub checkpoint from a transformers checkpoint and vice versa.
All of this is handled internally, so you can focus on trying out the models rather than converting them.
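To make the conversion step concrete, here is a deliberately simplified sketch of what mapping a transformers-style config to KerasHub-style names might look like. The keys on the transformers side follow the Llama config.json convention; the KerasHub-side names and the mapping table itself are illustrative assumptions, not the library's actual converter:

```python
# Hypothetical, simplified key mapping (the real converters live inside keras_hub).
CONFIG_KEY_MAP = {
    "num_hidden_layers": "num_layers",
    "hidden_size": "hidden_dim",
    "num_attention_heads": "num_query_heads",
    "intermediate_size": "intermediate_dim",
    "vocab_size": "vocabulary_size",
}

def convert_config(hf_config: dict) -> dict:
    """Rename recognized transformers config keys to KerasHub-style keys."""
    return {
        CONFIG_KEY_MAP[key]: value
        for key, value in hf_config.items()
        if key in CONFIG_KEY_MAP
    }

# A trimmed-down Llama-style config as it might appear in config.json.
hf_config = {
    "num_hidden_layers": 32,
    "hidden_size": 4096,
    "num_attention_heads": 32,
    "vocab_size": 128256,
}
keras_config = convert_config(hf_config)
# keras_config == {"num_layers": 32, "hidden_dim": 4096,
#                  "num_query_heads": 32, "vocabulary_size": 128256}
```

Weight names and tokenizer vocabularies are remapped analogously, which is why both libraries only need to agree on the architecture, not on a shared codebase.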
Common Use Cases
Generation
A first use case for language models is generating text. Here is an example of loading a transformers model and generating new tokens using KerasHub's .generate method:
from keras_hub.models import Llama3CausalLM

# Get the model
causal_lm = Llama3CausalLM.from_preset(
    "hf://NousResearch/Hermes-2-Pro-Llama-3-8B"
)

prompts = [
    """<|im_start|>system
You are a sentient, superintelligent artificial general intelligence, here to teach and assist me.<|im_end|>
<|im_start|>user
Write a short story about Kirby teaming up with Majin Buu to destroy the world.<|im_end|>
<|im_start|>assistant""",
]

# Generate from the model
causal_lm.generate(prompts, max_length=200)[0]
Changing accuracy
You can use keras.config to change the precision of your model:
import keras

keras.config.set_dtype_policy("bfloat16")

from keras_hub.models import Llama3CausalLM

causal_lm = Llama3CausalLM.from_preset(
    "hf://NousResearch/Hermes-2-Pro-Llama-3-8B"
)
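One practical reason to drop to bfloat16 is memory: halving the bytes per parameter roughly halves the size of the loaded weights. A back-of-the-envelope estimate for an 8-billion-parameter model such as the Llama 3 8B checkpoint above (parameter count approximated, activations and framework overhead ignored):

```python
params = 8_000_000_000          # ~8B parameters, approximate
bytes_float32 = params * 4      # float32: 4 bytes per parameter
bytes_bfloat16 = params * 2     # bfloat16: 2 bytes per parameter

gib = 1024 ** 3
print(f"float32:  ~{bytes_float32 / gib:.1f} GiB")   # ~29.8 GiB
print(f"bfloat16: ~{bytes_bfloat16 / gib:.1f} GiB")  # ~14.9 GiB
```

Note that bfloat16 keeps float32's exponent range while reducing mantissa precision, which is why it is a common default for inference on accelerators.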
Use checkpoints in the JAX backend
To test drive a model using JAX, you can leverage Keras to run it with the JAX backend. This is achieved by simply switching the Keras backend to JAX. Here's how to use the model within a JAX environment:
import os

# Select the backend before importing Keras
os.environ["KERAS_BACKEND"] = "jax"

from keras_hub.models import Llama3CausalLM

causal_lm = Llama3CausalLM.from_preset(
    "hf://NousResearch/Hermes-2-Pro-Llama-3-8B"
)
Gemma 2
We are pleased to share that the Gemma 2 models are also compatible with this integration.
from keras_hub.models import GemmaCausalLM

causal_lm = GemmaCausalLM.from_preset(
    "hf://google/gemma-2-9b"
)
PaliGemma
You can also use any PaliGemma safetensors checkpoint in your KerasHub pipeline.
from keras_hub.models import PaliGemmaCausalLM

pali_gemma_lm = PaliGemmaCausalLM.from_preset(
    "hf://gokaygokay/sd3-long-captioner"
)
What’s next?
This is just the beginning. We intend to expand this integration to encompass an even wider range of Hugging Face models and architectures. Stay tuned for updates, and be sure to explore the incredible potential this collaboration unlocks!
We would like to take this opportunity to thank Matthew Carrigan and Matthew Watson for their help throughout the process.