Unicode

From Higher Computing Science

Key points

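  • Unicode is an international character standard that extends ASCII to include characters used in non-English alphabets. It was originally designed around 16 bits per character rather than ASCII's 7, which gives 65,536 possible characters. Supplementary characters extend it beyond that.
  • Stored at 16 bits each, Unicode characters take double the storage space of ASCII characters, which are normally stored in one 8-bit byte each.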
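The storage point above can be checked with a minimal Python sketch. It assumes the 16-bit form of Unicode is the one Python calls "utf-16-be" (UTF-16 without a byte-order mark); the variable names are only for illustration. UTF-8, a different Unicode encoding, is included for comparison because it stores ASCII-range characters in a single byte.

```python
# Compare how many bytes the same five-letter word needs in different encodings.
text = "Hello"

ascii_bytes = text.encode("ascii")      # ASCII: 1 byte per character
utf16_bytes = text.encode("utf-16-be")  # 16-bit Unicode: 2 bytes per character
utf8_bytes = text.encode("utf-8")       # UTF-8: 1 byte per character for ASCII-range text

print("ASCII :", len(ascii_bytes), "bytes")
print("UTF-16:", len(utf16_bytes), "bytes")
print("UTF-8 :", len(utf8_bytes), "bytes")
```

For this word the 16-bit encoding needs 10 bytes against 5 for ASCII, which is the doubling described in the key points.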

Information

In the 1980s, it became clear that ASCII codes could not cater for all computer users or for the many languages used on computers. Programmers had been re-using the same character codes for different characters, which meant that data could not be exchanged reliably between computers using different character sets.

Unicode was agreed as a standard in 1991, and originally used 16 bits to store each character, rather than ASCII's 7 bits (usually stored in an 8-bit byte). Unicode can also use additional bits to provide supplementary character sets. This means that Unicode can be extended, and can represent just over a million different characters (1,114,112 code points).
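As a rough illustration of supplementary characters, the Python sketch below compares a character that fits in 16 bits with one that does not. The two sample characters and the variable names are chosen only for this example, and "utf-16-be" is assumed as the 16-bit encoding.

```python
# One character inside the original 16-bit range, one outside it.
sixteen_bit_char = "\u00e9"        # e with an acute accent, code point U+00E9
supplementary_char = "\U0001F600"  # an emoji, code point U+1F600

for ch in (sixteen_bit_char, supplementary_char):
    code_point = ord(ch)                      # the character's Unicode number
    needs_more_bits = code_point > 0xFFFF     # True if it will not fit in 16 bits
    utf16_length = len(ch.encode("utf-16-be"))  # bytes used in 16-bit Unicode
    print(hex(code_point), needs_more_bits, utf16_length)
```

The first character is stored in 2 bytes. The second lies above 65,535, so the 16-bit encoding has to use two 16-bit units (4 bytes) to represent it.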

The program below will provide the Unicode value for any character.
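The original program is not included on this page, so the following is a minimal Python sketch of one way to do it rather than the article's own code. It relies on Python's built-in ord function; the function names unicode_value and describe, and the sample characters, are made up for this example.

```python
def unicode_value(character):
    """Return the Unicode value (code point) of a single character."""
    return ord(character)


def describe(character):
    """Return the character with its value in decimal and in U+ hexadecimal form."""
    code_point = unicode_value(character)
    return f"{character!r}: decimal {code_point}, U+{code_point:04X}"


# Try a few sample characters: a letter, a digit and the euro sign.
for sample in ["A", "a", "7", "\u20ac"]:
    print(describe(sample))
```

To look up a character typed at the keyboard, the list of samples could be replaced with a call to input(). The built-in chr function does the reverse and turns a value back into its character.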

Videos

Further information

Test yourself

Teaching resources