Characters - Coggle Diagram
Characters
Python
print(int("1000011", 2))  # prints 67, the denary version of the number (the second argument is the base the number is written in)
Unicode
The most common version uses 16 bits for encoding, which gives us 2^16 (65,536) possibilities.
Unicode was developed to account for every language in the world. It is a standard character set that virtually all modern computers support.
Code page
For example, code page 1253 (a Windows code page, which is separate from Unicode) represents the Greek alphabet. If a person in Greece has a Greek keyboard and the Greek code page is selected in the operating system, the correct characters appear on the screen when they press a key.
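A minimal sketch of why the selected code page matters, assuming Python's built-in `cp1253` (Greek) and `cp1252` (Western European) codecs. The byte value 225 is chosen only for illustration; it does not come from the notes above.

```python
# One stored byte with the value 225 (0xE1 in hexadecimal).
raw = bytes([0xE1])

# The same byte is shown as a different character depending on the code page.
as_greek = raw.decode("cp1253")    # Greek code page
as_western = raw.decode("cp1252")  # Western European code page

print(as_greek)
print(as_western)
```

With the Greek code page the byte is read as a Greek letter; with the Western European one it is read as an accented Latin letter.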
ASCII
This code is called ASCII (American Standard Code for Information Interchange) and is used to allow the computer to understand the characters that have been typed in by a human. The word ‘Computing’ uses the denary codes: 67 111 109 112 117 116 105 110 103
The computer stores these in binary as:
01000011 01101111 01101101 01110000 01110101 01110100 01101001 01101110 01100111
Each character is given a unique binary code and that is how the computer can represent the correct character.
You will notice that each character is stored in 8 bits, but only 7 of those bits are used for encoding (representing characters). 7 bits give 2^7 = 128 possible combinations, so ASCII can represent 128 characters.
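The codes for ‘Computing’ can be checked with a short sketch using Python's built-in `ord()` and `format()`. The variable names are illustrative only.

```python
word = "Computing"

# ord() gives each character's code as a denary number.
denary_codes = [ord(ch) for ch in word]

# format(code, "08b") writes that number as an 8-bit binary string.
binary_codes = [format(code, "08b") for code in denary_codes]

print(denary_codes)
print(" ".join(binary_codes))

# Every ASCII code is below 2**7, so the leading (8th) bit is always 0.
all_fit_in_7_bits = all(code < 2**7 for code in denary_codes)
print(all_fit_in_7_bits)
```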
Problems with ASCII
ASCII cannot represent characters from non-English alphabets, and it can only represent a limited number of symbols.
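This limit can be seen in a small sketch. The helper `fits_in_ascii` and the Greek sample word are hypothetical, chosen only to illustrate the point.

```python
def fits_in_ascii(text):
    """Return True if every character in text has an ASCII code (0 to 127)."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        # Raised when a character has no code in the ASCII character set.
        return False


print(fits_in_ascii("Computing"))
print(fits_in_ascii("Ελλάδα"))
```

The English word encodes without a problem, while the Greek word is rejected because its letters have no ASCII codes.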
Character set
The term character set describes the set of characters that can be represented in a computer system.
A character set is not a font or a typeface. The character set tells the computer what a character is - whether it's a 3 or a T. The font or typeface determines how those characters look, but a 3 is a 3 whether it's in bold or italic, Times New Roman or Arial.
Extended Unicode
Uses 21 bits for encoding, which is enough to cover ancient scripts such as Egyptian hieroglyphs, and even emojis.
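As a rough check, the sketch below reports how many bits each character's Unicode code needs. The sample characters and the helper `bits_needed` are illustrative assumptions, not part of the original notes.

```python
def bits_needed(character):
    """Return the number of bits needed to write the character's Unicode code."""
    return ord(character).bit_length()


# A Latin letter, a Greek letter, an Egyptian hieroglyph and an emoji.
samples = ["A", "\u03b1", "\U00013000", "\U0001F600"]

for ch in samples:
    print(ch, ord(ch), bits_needed(ch))

# The largest Unicode code (hexadecimal 10FFFF) fits in 21 bits.
largest_code_bits = (0x10FFFF).bit_length()
print(largest_code_bits)
```

The hieroglyph and the emoji both need more than 16 bits, which is why the 16-bit version cannot hold them.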
Relationship between the number of bits used for encoding and the number of characters that can be represented
Character sets: the number of characters that can be represented = 2^(number of bits used for encoding)
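The rule can be tried out with a short sketch. The function name `characters_for` is hypothetical, and the bit counts are the ones mentioned in these notes.

```python
def characters_for(bits):
    """Return how many different characters can be encoded with this many bits."""
    return 2 ** bits


# 7 bits (ASCII), 8 bits (one byte), 16 bits (Unicode), 21 bits (Extended Unicode).
for bits in (7, 8, 16, 21):
    print(bits, "bits ->", f"{characters_for(bits):,}", "characters")
```

Each extra bit doubles the number of characters that can be represented.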