Earlier this month, the Unicode Consortium (a nonprofit whose member companies include Apple, Google, IBM, and Microsoft, among others) announced a new version of the Unicode Standard that would bring more than 250 new emoji to people's devices in the near future. Included in the list of new emoji are a golfer, a racing motorcycle, a beach with an umbrella, and a derelict house building. Why were these emoji chosen, and not others?

The simple answer is: they weren't chosen, not really. Any child at bedtime asking his or her mother "Mommy, where do emoji come from?" will no doubt go to sleep disappointed. There is no committee that decides that there needs to be a chipmunk emoji, or that an aardvark emoji would just be beyond the pale. Rather, emoji, like language itself, have a life of their own.

Numbers First

The first thing to understand is that computers don't really understand text. They only understand numbers. When you send a message on your smartphone, you aren't actually sending text to someone. Your smartphone is taking a message, breaking it down into a sequence of numbers (called bytes), and then beaming them to another smartphone, where those numbers are then shown to you as text characters, thanks to fonts.


This system is called Unicode, and it's a sort of human-to-computer Rosetta Stone. It's an encoding standard that makes sure the message sent from your iPhone in America can be read on an Android phone in Argentina or a Windows Phone in Siberia. Text shown on different devices might have different typefaces and font sizes, but the actual meaning will be the same.
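To make the round trip concrete, here is a minimal Python sketch of a message being turned into numbers and back. The article does not say which byte encoding phones use, so UTF-8 (one common way of writing Unicode text as bytes) is an assumption here, and the function names `send` and `receive` are purely illustrative.

```python
def send(message: str) -> bytes:
    """Turn text into the sequence of bytes that would actually be transmitted.

    UTF-8 is assumed for illustration; the article does not name an encoding.
    """
    return message.encode("utf-8")


def receive(payload: bytes) -> str:
    """Turn the received bytes back into text on the other device."""
    return payload.decode("utf-8")


message = "Hola \U0001F600"          # "Hola " followed by a grinning-face emoji
payload = send(message)

# Each byte is just a number between 0 and 255.
numbers = list(payload)
print(numbers)

# The receiving side recovers exactly the same text from the same numbers.
print(receive(payload) == message)
```

The plain letters each become a single number, while the emoji takes four bytes. How the recovered text is drawn on screen is then up to the receiving device's fonts.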

As part of the standard, the Unicode Consortium maintains a giant database of international symbols, each of which corresponds to a unique number a computer can understand. Letters, numbers, and punctuation marks are part of this database, but Unicode also contains many other symbols, such as the characters used to write Chinese, or pictographs, like emoji. Think of it like a giant reference chart, with numbers on one side and a pictorial representation of a character on the other, and you've got the right idea.
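The reference chart can be explored directly from Python, whose standard library ships a copy of the Unicode character database in the `unicodedata` module. The sketch below is only an illustration: the helper names `describe` and `find` are made up for this example, and the number shown for each character is what Unicode calls its code point.

```python
import unicodedata


def describe(ch: str) -> tuple[str, str]:
    """Return a character's number (in U+XXXX notation) and its official name."""
    code_point = ord(ch)                      # the unique number for this character
    name = unicodedata.name(ch, "<unnamed>")  # fallback if the database has no name
    return f"U+{code_point:04X}", name


def find(name: str) -> str:
    """Look a character up by its official Unicode name."""
    return unicodedata.lookup(name)


# A letter, a punctuation mark, a Chinese character, and an emoji
# all live in the same chart.
for ch in ["A", "?", "\u4E2D", "\U0001F4A9"]:
    print(ch, *describe(ch))

# The lookup also works in the other direction, from name to character.
print(find("PILE OF POO") == "\U0001F4A9")
```

Note that the database stores only the number and the name. The picture a reader actually sees comes from whichever font the device uses to draw that number.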

What Gets Added To The Unicode Database?

Not just any character or symbol can get added to the Unicode database. Instead, every petition for a new symbol has to undergo a complicated vetting process. Speaking to Co.Design, Mark Davis, president of the Unicode Consortium, says the major criterion for determining whether a new character or symbol is added to the standard is whether it's already being used extensively in text-based communication: for example, in analog print, or in writing.

"It has to be in the wild already," says Davis. You can't just design a new character and submit it to Unicode for approval: you need to basically prove that the Unicode standard has a hole in it without that character, because people are already using it to communicate every single day. Even if you do prove it, though, getting the character adopted can take years. For an extreme example, consider Egyptian hieroglyphics. Although they have been used for thousands of years, and scholars write about them every day, they were only added to the Unicode standard in 2010.

Given the above criteria, it seems incredible that Unicode has as extensive an emoji library as it does. Emoji are fun, but are they essential? Apparently, yes.

Emoji Are Essential To Communication

The explanation for why Unicode supports cartoon hot dogs, piles of poo and raspberrying ghosts at all is actually fairly straightforward: Emoji were proven to be essential. Although emoji weren't officially part of the Unicode Standard until 2010, the colorful cartoon symbols have been a major part of Japanese smartphone culture since 1998, when they debuted as a cute software feature on local phones. Pretty soon, millions of Japanese phones across multiple carriers came with huge emoji libraries pre-installed.


The problem, though, was that phones outside of Japan didn't understand emoji, because Unicode didn't support them. If a Japanese kid sent an American friend a message with an emoji, the friend's phone would just cough up some gibberish. For hardware and software makers, this meant that if they wanted their devices to support emoji, they couldn't rely upon the Unicode Standard. They had to hack in support for emoji, defeating the point of adopting Unicode in the first place.

In 2010, Unicode released version 6.0 of the standard, which included a library of 722 emoji that were common to all three of the major cell phone carriers in Japan. Unicode didn't design or create any of these emoji. In fact, these emoji often look very different from one device to another, because emoji, like letters, come in fonts. They exist in the Unicode Standard at all because they spread virally through Japan and into the rest of the world even though Unicode didn't support them, largely because big companies like Apple and Google added their own support.

The "new" emoji being added to Unicode as part of the 7.0 standard are actually even older. In fact, they are mostly made up of symbols that have been in use since 1990 as part of Microsoft's Wingdings and Webdings fonts, which ship with every version of Microsoft Office. These emoji have spent the better part of a quarter century being used every day before becoming standard.

The truth is that adding new emoji to Unicode isn't that much different from adding a 27th letter to the alphabet. First, you've got to use it yourself. Then, you've got to get other people using it. And finally, you have to prove to experts that the alphabet has a hole in it without it. That might be enough to make an amateur emoji designer despair, but the fact that every smartphone on Earth now ships with a character representing an anthropomorphic pile of poo proves that it can be done.