A-level Computing 2009/AQA/Problem Solving, Programming, Data Representation and Practical Exercise/Fundamentals of Data Representation/Unicode



 The problem with ASCII is that it only allows you to represent a small number of characters (128, or 256 for Extended ASCII). This might be OK if you are living in an English-speaking country, but what happens if you live in a country that uses a different character set? For example:
 * Chinese characters 汉字
 * Japanese characters 漢字
 * Cyrillic Кири́ллица
 * Gujarati ગુજરાતી
 * Urdu اردو

 You can see that we quickly run into trouble, as ASCII can't possibly store these hundreds of thousands of extra characters in just 7 bits. What we use instead is Unicode. There are several Unicode encodings, such as UTF-8, UTF-16 and UTF-32, each using a different number of bits to store data. With over a million possible characters, we should be able to store every character from every language on the planet. You can find out more about Unicode encoding on Wikipedia.
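
 As a rough illustration of why 7 bits are not enough, the short Python sketch below looks at the sample words from the list of scripts in this section. It checks whether each word fits in ASCII and counts how many bytes it needs in three common Unicode encodings. The helper names `is_ascii` and `encoded_sizes` are made up for this example, and the choice of UTF-8, UTF-16 and UTF-32 is an assumption about which encodings are of interest.

```python
# Sample words copied from the list of scripts in this section, plus one ASCII word.
samples = ["汉字", "漢字", "Кири́ллица", "ગુજરાતી", "اردو", "ASCII"]


def is_ascii(text):
    """True if every character fits in 7-bit ASCII (code points 0 to 127)."""
    return all(ord(ch) < 128 for ch in text)


def encoded_sizes(text):
    """Return the number of bytes needed to store text in three Unicode encodings.

    The "-le" codec variants are used so that no byte order mark is added,
    which keeps the counts easy to compare.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


for word in samples:
    code_points = [hex(ord(ch)) for ch in word]
    print(word, code_points, is_ascii(word), encoded_sizes(word))
```

 Only the last word passes the ASCII check. Every other word contains code points far above 127, so it needs one of the Unicode encodings.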

100 0111 - as it is 3 characters further on in the alphabet

110 1101 - as it is 6 characters down in the alphabet
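
The two bit patterns above can be checked with a few lines of Python. This is only a sketch: the questions these answers belong to are not shown here, so the starting letters `D` and `g` are assumptions inferred from the patterns and offsets, and the helper names `to_ascii_bits` and `shift_letter` are invented for the example.

```python
def to_ascii_bits(ch):
    """Return the 7-bit ASCII pattern for ch, written as 3 bits, a space, then 4 bits."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not an ASCII character")
    bits = format(code, "07b")
    return bits[:3] + " " + bits[3:]


def shift_letter(ch, offset):
    """Return the character that sits offset places after ch in the ASCII table."""
    return chr(ord(ch) + offset)


# Assumed starting letters: 'D' and 'g' (not stated in the text above).
print(to_ascii_bits(shift_letter("D", 3)))  # 'G' -> 100 0111
print(to_ascii_bits(shift_letter("g", 6)))  # 'm' -> 110 1101
```

Because the letters are stored in alphabetical order, moving along the alphabet is the same as adding to the character's code.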

Each character only takes up 8 bits, meaning that storing data in ASCII may take up less memory than Unicode.

ASCII stores a much smaller character set than Unicode, meaning that you are limited to the Latin character set and cannot represent characters from other languages.

2^7 = 128
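
The same calculation works for any number of bits. The sketch below uses a hypothetical helper, `patterns`, to count how many different values a given number of bits can hold, and compares the result with the number of Unicode code points that Python reports through `sys.maxunicode`.

```python
import sys


def patterns(bits):
    """Return how many different values can be stored in the given number of bits."""
    return 2 ** bits


print(patterns(7))         # 128, the size of the ASCII character set
print(patterns(8))         # 256, the size of an Extended ASCII character set
print(sys.maxunicode + 1)  # total number of Unicode code points
print(patterns(21))        # 21 bits are enough to cover every code point
```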

Unicode, as it would allow you to display non-Latin character sets such as Hindi and Cyrillic.