It’s Tuesday morning. As a Transformers maintainer, I do the same thing most weekday mornings. I open PyCharm, load the Transformers codebase, and gaze fondly at the chat template documentation while ignoring the 50 user issues I’ve been pinged on that day. But something feels different this time.
Something… wait. Computer! Enhance!
Is that…?
Those user issues are definitely staying unanswered today. Let’s talk about the Hugging Face integration in PyCharm.
The Hugging Face is coming from inside the house
I could introduce this integration by just listing its features, but that would be tedious, and the documentation already does it. Instead, let’s look at how you’d actually use it. Say I’m writing a Python app and I decide I want it to chat with users. In addition to text, though, users should be able to paste in images, and the app should chat naturally about those as well.
If you’re not familiar with the current state of the art in machine learning, this may sound like a daunting request, but don’t worry. Just right-click in your code and select Insert HF Model. A dialog box will appear.
Chats that mix images and text fall under the task category “image-text-to-text”: the user supplies an image and some text, and the model replies with text. Scroll down the category list on the left until you find it. By default, the list of models is sorted by likes, but keep in mind that older models often accumulate a lot of likes even when they’re no longer cutting edge. You can check how old a model is from the last-updated date just below its name. Let’s pick one that is both recent and popular: microsoft/Phi-3.5-vision-instruct.
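If you prefer to explore the same catalogue from Python, the `huggingface_hub` client library can reproduce this search. This is a side sketch rather than part of the IDE workflow; the sort-by-likes ordering mirrors the dialog’s default:

```python
# Sketch: the dialog's model search, done with the huggingface_hub library.
# Requires `pip install huggingface_hub` and network access.
from huggingface_hub import list_models

# "image-text-to-text" is the task category we scrolled to in the dialog.
models = list_models(
    task="image-text-to-text",
    sort="likes",     # the dialog's default ordering
    direction=-1,     # most-liked first
    limit=5,
)
for model in models:
    print(model.id, model.likes)
```

Remember the caveat from above, though: sorting by likes surfaces old favorites, so check the dates before committing to a model.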
For some model categories, clicking Use Model will paste basic boilerplate straight into your code, but in many cases it’s more effective to grab the sample code from the model card on the right instead. The complete model card appears on the right side of the dialog box, exactly as you’d see it on the Hugging Face Hub. Let’s paste that sample code into our project and run it.
Your office’s cybersecurity guy might complain that you’re copying random chunks of code off the internet and running them without reading them first. If so, just call that person a nerd and carry on regardless. And lo and behold, we have a working model we can have fun chatting with about images: in this case, it reads and comments on a screenshot of a Microsoft slide deck. Feel free to play around with this example. Try your own chats and your own images. Once it’s working, wrap the code in a class and it’s ready to drop into your app. That’s cutting-edge open-source machine learning in ten minutes, without even opening a web browser.
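The sample code itself lives in the model card, so here is only a hedged sketch of that final “wrap it in a class” step. The names (`ImageChat`, `image_placeholders`) are mine, not part of the integration, and the prompt format assumes Phi-3.5-vision’s numbered `<|image_k|>` placeholders as shown in its card:

```python
def image_placeholders(n):
    """Phi-3.5-vision expects numbered <|image_k|> tags ahead of the text."""
    return "".join(f"<|image_{i}|>\n" for i in range(1, n + 1))


class ImageChat:
    """Wraps the model-card sample code; the heavy objects load lazily."""

    MODEL_ID = "microsoft/Phi-3.5-vision-instruct"

    def __init__(self, device="cuda"):
        self.device = device
        self._model = None
        self._processor = None

    def _load(self):
        # Deferred load: the first chat() call pays the download cost,
        # after which the weights come from the local Hugging Face cache.
        from transformers import AutoModelForCausalLM, AutoProcessor

        self._model = AutoModelForCausalLM.from_pretrained(
            self.MODEL_ID,
            device_map=self.device,
            torch_dtype="auto",
            trust_remote_code=True,
        )
        self._processor = AutoProcessor.from_pretrained(
            self.MODEL_ID, trust_remote_code=True
        )

    def chat(self, text, images=None):
        """Send one user turn (text plus optional PIL images), return the reply."""
        if self._model is None:
            self._load()
        images = images or []
        messages = [
            {"role": "user", "content": image_placeholders(len(images)) + text}
        ]
        prompt = self._processor.tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )
        inputs = self._processor(
            prompt, images or None, return_tensors="pt"
        ).to(self.device)
        generated = self._model.generate(**inputs, max_new_tokens=500)
        # Strip the prompt tokens so only the model's reply is decoded.
        generated = generated[:, inputs["input_ids"].shape[1]:]
        return self._processor.batch_decode(
            generated, skip_special_tokens=True
        )[0]
```

Loading lazily keeps importing the module cheap; your app only pays the model-loading cost when it first calls `chat()`.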
These models can be large. If you hit memory errors, try a GPU with more memory, or reduce the memory footprint of the sample code (for example, by lowering the `num_crops` setting). You can also remove `device_map="cuda"` to keep the model in CPU memory instead, at the expense of speed.
instant model cards
Next, let’s flip the scenario: now you’re not the author of this code but the colleague who has to review it. Maybe you’re the cybersecurity guy from earlier, still annoyed about the “nerd” comment. You look at this snippet and have no idea what you’re looking at. Don’t panic. Simply hover your mouse over the model name to instantly see the entire model card, so you can quickly see where this model came from and what it’s intended for.
(This is also very useful if you’ve been working on something else and have completely forgotten about the code you wrote two weeks ago.)
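The card that pops up on hover can also be fetched programmatically, which is handy if your review workflow lives in scripts rather than the IDE. A sketch using `huggingface_hub`’s `ModelCard` helper (network access assumed):

```python
# Sketch: fetch a model card outside the IDE with huggingface_hub.
from huggingface_hub import ModelCard

card = ModelCard.load("microsoft/Phi-3.5-vision-instruct")
print(card.text[:300])  # card.text is the README body you see on the Hub
```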
local model cache
You may notice that the model has to download the first time you run this code, but that it loads much faster afterwards: it’s stored in a local cache. Remember that mysterious little 🤗 icon from earlier? Just click it, and it will show you a list of everything in your cache.
This is a great way to find the models you’re currently working with, and to reclaim disk space by deleting models you no longer need. It’s also very useful in the two-week-memory-loss scenario: if you can’t remember which model you were using back then, it’s probably in there. Keep in mind, though, that most practical, production-ready models in 2024 are over 1 GB, so the cache can fill up quickly.
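The same cache the 🤗 icon displays can be inspected from Python with `huggingface_hub`’s `scan_cache_dir()`. A sketch (the `cached_models` helper is mine):

```python
# Sketch: list cached Hugging Face repos and their size on disk.
from huggingface_hub import scan_cache_dir

def cached_models():
    """Return (repo_id, size_in_bytes) for every repo in the local cache."""
    try:
        cache = scan_cache_dir()
    except Exception:  # e.g. CacheNotFound when nothing has been downloaded yet
        return []
    return [(repo.repo_id, repo.size_on_disk) for repo in cache.repos]

for repo_id, size in cached_models():
    print(f"{repo_id}: {size / 1e9:.2f} GB")
```

This is the programmatic route to the same cleanup the IDE offers: spot the multi-gigabyte repos you no longer need and delete them.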
Python in the age of AI
At Hugging Face, we tend to see open-source AI as a natural extension of the open-source philosophy. Open software solves problems for developers and users and creates new functionality they can integrate into their own code, and open models do the same. The implementation details are all so novel and exciting that it’s easy to be blinded by the complexity and focus too much on them, but models exist to do things for you. If you abstract away the architecture and training details, a model is essentially a function: a tool in your code that transforms a certain type of input into a certain type of output.
That’s why these features are such a natural fit. Just as your IDE already retrieves function signatures and docstrings, it can retrieve sample code and model cards for trained models. Integration like this makes a chat or image-recognition model as easy to reach for and import as any other library. We think it’s clear that this is the future of code, and we hope you find these features useful.
Download PyCharm and try out the Hugging Face integration.
Get 3 months of PyCharm for free using the code PyCharm4HF.