To keep the future of AI open, we are very excited to announce that GGML, the creators of llama.cpp, are joining Hugging Face (HF). 🔥
Georgi Gerganov and the team join HF with the goal of expanding and supporting the community behind ggml and llama.cpp as local AI continues to make significant advances over the next few years.
We’ve been working with Georgi and the team for quite some time (the team already includes core contributors to llama.cpp like Son and Alek), so this was a very natural process.
This is basically a match made in heaven, as llama.cpp is a fundamental building block for local inference, and transformers is a fundamental building block for model definition. ❤️
What will change for the llama.cpp open source project and its community?
Not much – Georgi and the team will continue to spend 100% of their time maintaining llama.cpp, with full autonomy and leadership over its technical direction and community. HF provides long-term, sustainable resources that increase the project's chances of growth and success. The project will remain 100% open source and community-driven, as it is today.
technical focus
llama.cpp is the fundamental building block for local inference, and transformers is the fundamental building block for model and architecture definition. We will therefore work to make shipping new models to llama.cpp from the transformers library – the model-definition "source of truth" – as seamless (nearly "single-click") as possible.
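For a sense of what that pipeline involves today, here is a minimal sketch of the current multi-step flow. It assumes a local checkout of the llama.cpp repository, and the model id is purely illustrative; the converter script and flags reflect llama.cpp's tooling at the time of writing, not the future "single-click" path.

```python
# Sketch of the current transformers -> llama.cpp flow (illustrative, not the future path).
# Assumes the llama.cpp repository is checked out next to this script; model id is an example.
import subprocess
from huggingface_hub import snapshot_download

# 1. Download the transformers "source of truth" checkpoint from the Hub.
checkpoint_dir = snapshot_download("Qwen/Qwen2.5-0.5B-Instruct")

# 2. Convert the checkpoint to GGUF with llama.cpp's converter script.
subprocess.run(
    [
        "python", "llama.cpp/convert_hf_to_gguf.py",
        checkpoint_dir,
        "--outfile", "model.gguf",
        "--outtype", "f16",
    ],
    check=True,
)
```

Collapsing these manual steps into a single action whenever a new architecture lands in transformers is exactly the kind of friction we want to remove.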
We will also improve the packaging and user experience of ggml-based software. As local inference becomes a meaningful and competitive alternative to cloud inference, it is important to improve and simplify how casual users deploy and access local models. We are working to make llama.cpp ubiquitous and easy to install everywhere.
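As an illustration of the kind of experience we want to streamline, here is a minimal sketch of loading a converted GGUF locally through the community llama-cpp-python bindings; the model path matches the conversion sketch above and the prompt is arbitrary.

```python
# Minimal local-inference sketch using the llama-cpp-python bindings
# (a community Python wrapper around llama.cpp); path and prompt are illustrative.
from llama_cpp import Llama

llm = Llama(model_path="model.gguf", n_ctx=2048)
result = llm("Q: Why run models locally? A:", max_tokens=64)
print(result["choices"][0]["text"])
```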
our long-term vision
Our common goal is to provide the community with the building blocks to make open source superintelligence available to people around the world for years to come.
We will achieve this by collaborating with our growing local AI community as we continue to build the ultimate inference stack that runs as efficiently as possible on-device.
