What the heck is UTF-8?
Transcript of What the heck is UTF-8?
ASCII encoder, ASCII decoder
But what is ASCII?
ASCII code was developed in the days of the telegraph to encode letters, numbers, symbols (and some other stuff) into simple binary code.
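As a rough illustration of that "simple binary code", the sketch below prints the ASCII number and bit pattern for each character of a short sample string. The helper name `ascii_bits` and the sample text "Hi!" are assumptions made for this example; they do not come from the presentation.

```python
def ascii_bits(text: str) -> list[str]:
    """Return the 7-bit ASCII pattern for each character in text."""
    patterns = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            # Anything above 127 does not exist in ASCII.
            raise ValueError(f"{ch!r} is not an ASCII character")
        patterns.append(format(code, "07b"))
    return patterns


for ch, bits in zip("Hi!", ascii_bits("Hi!")):
    print(ch, ord(ch), bits)
# H 72 1001000
# i 105 1101001
# ! 33 0100001
```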
It used 7 bits (ones and zeros) to represent each character, which gives 128 possible codes.
That's my boy!
ASCII is #$!?*%@ great… but now I want to travel the world!
Be careful out there son, and ahhm, watch your language!
!!??
Je m'appelle Amélie, je suis française.
Tu veux du café?
I am so s.m.r.t.
In Europe you're going to need more bits. Come inside to extend your ASCII!
ISO-8859-1 (Latin-1)
Covers most of the other characters needed to express Western European languages. By adding an extra bit, you can have twice as many characters (256 instead of 128).
Ich bin so Europäischen…
C'est génial.
¿Qué te parece?
¡¿?!
¡Oh, no!
I see you're confused. Worry not, for Unicode will help you.
I want to go home!
Unicode
A huge list of all characters known to man, woman and android-kind!
Hmm… I knew I was going to need more bits!
Start off with basic ASCII… …continue with ISO-8859-1… …then add everything else…
Character sets needed so far: ISO-8859-1
Character sets learnt so far: ISO-8859-1
Fully supported:
English (UK and US)
Irish (new orthography)
Kurdish (The Kurdish Unified Alphabet)
Latin (basic classical orthography)
Luxembourgish (basic classical orthography)
Norwegian (Bokmål and Nynorsk)
Portuguese (Portuguese [European] and Brazilian)
Walloon
Almost fully supported:
Irish (traditional orthography)
Latin with macrons
Welsh
HEALTH WARNING
To prevent bloating, you'll need to use a clever way to represent Unicode.
I suggest UTF!
I'm going to need more bits!
ISO-8859-1: Hello! = 010010000110010101101100011011000110111100100001
48 bits (6 bytes)
Raw Unicode*: Hello! = 000000000000010010000000000000000110010100000000000001101100000000000000011011000000000000000110111100000000000000100001
120 bits (15 bytes)
(This illustration uses 20 bits per character; the full Unicode range, up to U+10FFFF, needs 21.)
* Never really done!
Hmm… this Unicode seems like a waste of space!
How is it possible to:
Represent all Unicode characters
Be backwards compatible with ASCII
Keep file sizes sensible?
Rebellious teenager
Café
Unicode positions:
C = 67 = 01000011
a = 97 = 01100001
f = 102 = 01100110
é = 233 = 11101001
UTF-8 translation:
é → 11000011 10101001
0100001101100001011001101100001110101001
40 bits (5 bytes)
Ah, that's better
That's the power of UTF-8
Now you know!
Just use 8 bits for the original ASCII characters. Then do some really clever stuff to represent all the rest.
That'll keep American Dad happy!
UTF-8: Universal Character Set (UCS) Transformation Format, 8-bit
Character sets needed so far:
ISO 8859-5 (Cyrillic)
ISO 8859-7 (Greek)
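The Café translation from earlier in this transcript can be checked with a short sketch. The function `utf8_encode` below is a hypothetical helper written for this example, not something from the presentation. It hand-builds the UTF-8 bytes for one code point using the standard bit layout (1 byte up to U+007F, 2 bytes up to U+07FF, 3 bytes up to U+FFFF, 4 bytes up to U+10FFFF) and compares the result with Python's built-in `str.encode`. To stay short, it does not reject surrogate code points, which a real encoder would.

```python
def utf8_encode(code_point: int) -> bytes:
    """Hand-rolled UTF-8 encoding of a single code point (illustrative only)."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("outside the Unicode range")
    if code_point <= 0x7F:
        # 0xxxxxxx: plain ASCII, stored unchanged in one byte
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0xFFFF:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def bits(data: bytes) -> str:
    """Render bytes as space-separated 8-bit groups."""
    return " ".join(format(b, "08b") for b in data)


word = "Café"
encoded = b"".join(utf8_encode(ord(ch)) for ch in word)
assert encoded == word.encode("utf-8")  # agrees with the built-in encoder

print(bits(utf8_encode(ord("é"))))    # 11000011 10101001
print(len(encoded))                   # 5 bytes, as on the slide
print(len(word.encode("latin-1")))    # 4 bytes in ISO-8859-1
print(len("Hello!".encode("utf-8")))  # 6 bytes, same as ASCII
```

Because code points up to 127 come out as a single unchanged byte, any pure-ASCII text is already valid UTF-8, which is the backwards compatibility the slides ask for.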