

Whereas Linux distributions mostly switched to UTF-8 in 2004, Microsoft Windows generally uses UTF-16, and sometimes uses 8-bit code pages for text files in different languages. The differing default settings between computers are partly due to differing deployments of Unicode among operating system families, and partly due to the legacy encodings' specializations for different writing systems of human languages. A major source of trouble is communication protocols that rely on settings on each computer rather than sending or storing metadata together with the data. Mojibake is often seen with text data that have been tagged with a wrong encoding; the data may not even be tagged at all, but simply moved between computers with different default encodings. Because mojibake is a mismatch between the encoded data and the encoding assumed for it, it can be produced either by manipulating the data itself or just by relabeling it.
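The relabeling case can be illustrated with a minimal Python sketch. The helper names `misread` and `repair` are hypothetical, chosen only for this illustration; the codec names `utf-8` and `cp1252` (Windows-1252) are standard Python codecs. The sketch assumes the bytes are left untouched and only the encoding assumed by the reader changes.

```python
def misread(text: str, actual: str = "utf-8", assumed: str = "cp1252") -> str:
    """Encode text with its real encoding, then decode it with the wrong one.

    Bytes that have no meaning in the assumed encoding are shown as U+FFFD
    instead of raising an error, which is roughly what many viewers do.
    """
    return text.encode(actual).decode(assumed, errors="replace")


def repair(garbled: str, actual: str = "utf-8", assumed: str = "cp1252") -> str:
    """Undo a misread by reversing the wrong decoding step.

    This only works if no byte was lost or replaced along the way.
    """
    return garbled.encode(assumed).decode(actual)


if __name__ == "__main__":
    garbled = misread("café")
    print(garbled)          # cafÃ©
    print(repair(garbled))  # café
```

The two bytes that UTF-8 uses for "é" are each a valid Windows-1252 character, so one letter turns into two. Repair by reversing the steps is possible here only because every byte survived the misreading; once a byte has been replaced or dropped, the original text generally cannot be recovered this way.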

To correctly reproduce the original text that was encoded, the correspondence between the encoded data and the notion of its encoding must be preserved. Mojibake means "character transformation" in Japanese. The word is composed of 文字 (moji), "character", and 化け (bake, pronounced "bah-keh"), "transform".
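One way to preserve the correspondence between the data and its encoding is to carry the encoding label together with the bytes, rather than relying on each computer's default. The following is only a sketch under that assumption; `TaggedText` and `pack` are hypothetical names invented for this example, not part of any real protocol or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedText:
    """Encoded bytes stored together with the name of their encoding."""
    data: bytes
    encoding: str

    def text(self) -> str:
        # Decode with the label that travelled with the data,
        # not with the local machine's default encoding.
        return self.data.decode(self.encoding)


def pack(text: str, encoding: str = "utf-8") -> TaggedText:
    """Encode text and keep the encoding label attached to the result."""
    return TaggedText(text.encode(encoding), encoding)


if __name__ == "__main__":
    message = pack("文字化け", "utf-16")
    # The receiver recovers the text regardless of its own defaults.
    print(message.text())
    # Ignoring the label and guessing an encoding gives different text.
    print(message.data.decode("cp1252", errors="replace"))
```

Real formats achieve the same effect with their own metadata, such as a charset declaration stored or sent alongside the content.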

The UTF-8-encoded Japanese Wikipedia article for Mojibake as displayed if interpreted as Windows-1252 encoding
