It’s breathtaking. It’s a digital masterpiece. Why does that hand look like that?
In July 2022, the artificial intelligence (AI) company OpenAI announced DALL-E 2, one of the first AI image generators widely available to the public. Users could type anything into the prompt, from “Beyoncé eating pizza” to “Renaissance portrait of a poodle” to “Statue of Liberty skateboarding,” and DALL-E 2 would respond with a corresponding set of images. However, the images produced by DALL-E 2 were imperfect, often distorted, and sometimes unrelated to the user’s prompts. And there was competition. Around the same time, two other AI companies, Stability AI and Midjourney, released their own image-generating AI programs. Stability AI launched Stable Diffusion, and Midjourney introduced a tool named after itself. By August, Midjourney’s AI image generator was so advanced that one of its images won an art contest at a state fair.
However, users began noticing a recurring flaw whenever they entered a prompt involving a person into one of these generators. Like many novice artists, the AI tools couldn’t draw hands.
An AI-generated hand might have nine fingers, or fingers sticking out of the palm. In some images, the hand appears to be floating, detached from the human body. Elsewhere, two or more hands are fused at the wrist.
Why?
There are several reasons why AI has trouble with hands and fingers. One is simply that the hand is a small part of the human body. In photos of real people, hands are usually not the focus. Notably, AI programs tend to have the same problems with human teeth and ears as with hands. AI-generated teeth are often small, crowded, and even pointed, while ears are often depicted without lobes. Hands, teeth, and ears are all small and highly variable features of the human body. For example, if an AI scans a photo of someone with a missing tooth, it might conclude that all smiles have the same gap. In a January 2023 interview with BuzzFeed News, a Stability AI spokesperson explained that “hands are less prominent in human images than faces within our AI dataset.” To represent hands and fingers successfully, the AI requires additional reference photos that focus primarily on hands.
Another problem is that the AI doesn’t actually know what a hand is. In two-dimensional images, hands appear in dozens of different positions. Hands wave, flex, grasp objects, clench into fists, and partially disappear into trouser pockets. Humans know that these visual differences reflect how a hand works. The AI doesn’t have access to the three-dimensional world, so it knows only how a hand appears. Identifying a fist, a thumbs-up, or a peace sign as a hand is an impressive feat for AI, and you can’t fault it for assuming that an actual hand might be a combination of the three.
Some users believe that AI-generated hand quirks are a feature rather than a bug. The anomaly often serves as an easy way to distinguish real images from AI-generated ones. For example, a fake image of former President Donald Trump being arrested was revealed to be AI-generated thanks to a police officer’s hand melting into Trump’s body. The same goes for photos from an alleged “extreme tanning competition,” where one contestant’s fingers looked more like hot dogs than fingers. Another contestant’s hand had at least seven intertwined fingers. The New Yorker wrote in March 2023: “Looking at the gnarled hands of an AI, we fall into the uncanny valley and experience a visceral revulsion…The failure of the machine is, in a way, comforting.” Perhaps AI can’t understand human hands, The New Yorker and BuzzFeed News mused, because AI can’t understand what it’s like to be human.
However, even if AI’s struggle with hands can be seen as a positive, the problem may not last long. In March 2023, Midjourney released an update to its program aimed at making hands more realistic. Experts suspect that Midjourney adjusted its dataset to prioritize clear images of hands and deprioritize images in which hands were obscured or only partially visible. While the resulting images are still not perfect (the aforementioned image of former President Trump’s arrest was generated after the update), users generally agree that it is an improvement. DALL-E, Stable Diffusion, and others may follow suit as artificial intelligence companies compete to have the best image generator on the market. It’s a race to the perfect artificial hand.