More ‘Biang’ for Your Buck in Localization with Multilingual Fonts

Font Issues in Localization

Biang Biang noodles have become wildly popular in the West thanks to a combination of social media attention and an eminently punnable name. The dish, which hails from Xi’an in Northwest China, is named to mimic the sound of dough as it is banged against a countertop, and the resulting hand-ripped noodles are served with lashings of piquant chili oil.

Apart from making a delicious meal, Biang Biang noodles (biángbiángmiàn) are also famous for their name, which is written with one of the most complex Chinese characters in modern usage. With the standard traditional form consisting of 58 strokes, it is little wonder that people have invented mnemonics to help remember it.

Fortunately, ever since the simplified and traditional characters for biáng were encoded in Unicode in March 2020 (as U+30EDD and U+30EDE, respectively), one can get away with not learning how to write them.

Previously, there was no standardized way to represent the characters digitally. Workarounds such as substituting close phonetic equivalents, romanizing the name in pinyin, or even using images of the characters were common.

Substituting images or lookalike characters can work as a quick visual fix for human readers, but tends to cause issues when the text needs to be parsed by a computer. NLP applications ranging from speech synthesis to machine translation, spell checking, and content moderation all rely on correctly encoded text to work as expected and can be bamboozled by non-standard substitutions.
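To make the difference between encoded text and a workaround concrete, here is a minimal Python sketch. It assumes the code points U+30EDD (simplified) and U+30EDE (traditional) that Unicode 13.0 assigned to biáng; the helper names describe and mentions_biang are made up for illustration. It shows how each character is stored and why a simple text search finds the encoded name but misses a romanized substitute.

```python
# Code points assigned to biáng in Unicode 13.0
# (CJK Unified Ideographs Extension G).
BIANG_SIMPLIFIED = "\U00030EDD"
BIANG_TRADITIONAL = "\U00030EDE"


def describe(char: str) -> str:
    """Return the code point and UTF-8 bytes of a single character."""
    utf8_hex = char.encode("utf-8").hex(" ").upper()
    return f"U+{ord(char):04X} -> UTF-8 {utf8_hex}"


def mentions_biang(text: str) -> bool:
    """True only if the text contains an encoded biáng character."""
    return BIANG_SIMPLIFIED in text or BIANG_TRADITIONAL in text


# The dish name written with encoded characters: biáng, biáng, then
# U+9EB5 (the traditional character for "noodles").
encoded_name = BIANG_TRADITIONAL * 2 + "\u9EB5"
# The same name using the pinyin workaround.
romanized_name = "biángbiángmiàn"

print(describe(BIANG_SIMPLIFIED))      # U+30EDD -> UTF-8 F0 B0 BB 9D
print(describe(BIANG_TRADITIONAL))     # U+30EDE -> UTF-8 F0 B0 BB 9E
print(mentions_biang(encoded_name))    # True
print(mentions_biang(romanized_name))  # False
```

Both characters sit outside the Basic Multilingual Plane, so each takes four bytes in UTF-8. Software that assumes shorter encodings may need extra care with them.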

The Unicode Consortium maintains a widely accepted standard for encoding, representing, and handling text. This non-profit seeks to include the “widest range of human languages” and already covers an impressive number of the world’s writing systems in its bid to support digitally disadvantaged languages.

Returning to biáng, we can begin to appreciate that Unicode support is just one step (albeit a crucial one) toward robust multilingual support. Although both the simplified and traditional characters have been added to Unicode, they often fail to display correctly. Depending on the configuration of your browser, your operating system, and the fonts you have installed, they may appear as little boxes, question marks or other symbols, or even just blank space (and even when displayed correctly, they can be too dense to read at smaller font sizes). In this case, the computer is able to parse the text in question but humans cannot, which is hardly a satisfying solution.
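Whether a character renders comes down to whether the chosen font maps its code point to a glyph. The sketch below illustrates that check under loud assumptions: the function name uncovered_characters is invented for this example, and the coverage set is a hypothetical stand-in for a real font's character map, which in practice would have to be read from the font file with a font-inspection library such as fontTools.

```python
from typing import List, Set


def uncovered_characters(text: str, supported_code_points: Set[int]) -> List[str]:
    """List the code points in text that the font cannot render.

    supported_code_points stands in for a font's character map: the set
    of code points for which the font provides a glyph.
    """
    missing: List[str] = []
    seen: Set[str] = set()
    for char in text:
        if char.isspace() or char in seen:
            continue
        seen.add(char)
        if ord(char) not in supported_code_points:
            missing.append(f"U+{ord(char):04X}")
    return missing


# Hypothetical coverage: printable ASCII plus U+9EB5 ("noodles"),
# but no glyph for the traditional biáng character (U+30EDE).
hypothetical_coverage = set(range(0x20, 0x7F)) | {0x9EB5}

sample = "Biang " + "\U00030EDE\U00030EDE\u9EB5"
print(uncovered_characters(sample, hypothetical_coverage))  # ['U+30EDE']
```

Any code point reported here is one that would show up as "tofu" (or trigger a fallback font) when the text is displayed in that font.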

These sorts of font-rendering issues are precisely what the Noto Project, a collaborative effort by Google and Monotype, has sought to address. Its goal is to provide a freely available typeface for all scripts included in Unicode.

Short for “no tofu”, with ‘tofu’ being a nickname for the blank boxes that appear when an encoded character lacks font support, the Noto collection boasts the broadest multilingual coverage of any typeface. This makes it a handy option for designers seeking to accommodate localization across a large set of languages and writing systems, particularly those with limited font support.

This is what global furniture giant Ikea did in 2019, adopting Noto as its custom typeface to standardize its visual identity across the dizzying number of markets it serves. Ideally, this focus on multilingual support will encourage other companies to place localization needs at the center of their communication design rather than treat them as an afterthought.

Ikea might not be swapping its Swedish meatballs for Biang Biang noodles anytime soon, but if it does, customers will at least be able to fully enjoy both the noodles and their name in all its correctly rendered glory.