WORKING WITH UNICODE
Python supports Unicode, which means that you can work with characters from many different languages. In Python 3, every string is a Unicode string, so Unicode data is stored and manipulated in exactly the same way as any other string. The u prefix is still accepted but optional, as shown here:
>>> u'Hello from Python!'
'Hello from Python!'
Special characters can be included in a string by specifying their Unicode code point with a \u escape followed by four hexadecimal digits. For example, the following Unicode string embeds a space (which has the Unicode value 0x0020) in a string:
>>> u'Hello\u0020from Python!'
'Hello from Python!'
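The two interpreter sessions above can be checked in a short script. This is a minimal sketch that assumes Python 3. The variable names are illustrative only and do not come from the listings in this chapter.

```python
# Assumes Python 3, where every str is already a Unicode string.
plain = 'Hello from Python!'
prefixed = u'Hello from Python!'     # the u prefix is accepted but optional
same_text = (plain == prefixed)      # both literals produce the same str

escaped = u'Hello\u0020from Python!' # \u0020 is the escape for a space
space_code = ord(' ')                # character -> integer code point
space_char = chr(0x0020)             # integer code point -> character

print(same_text)
print(escaped == plain)
print(hex(space_code), repr(space_char))
```

The built-in functions ord() and chr() convert between a character and its code point, which is a convenient way to find the value to place after \u.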
Listing 1.1 shows the contents of unicode1.py, which displays one string of Chinese (Mandarin) characters and another string of Japanese characters.
LISTING 1.1: unicode1.py
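# Chinese text with an embedded ASCII word, written as \u escapes
chinese1 = u'\u5c07\u63a2\u8a0e HTML5 \u53ca\u5176\u4ed6'
# Japanese hiragana text with an ASCII prefix, written as \u escapes
hiragana = u'D3 \u306F \u304B\u3063\u3053\u3043\u3043 \u3067\u3059!'
print('Chinese:', chinese1)
print('Hiragana:', hiragana)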
The output of Listing 1.1 is here:
Chinese: 將探討 HTML5 及其他
Hiragana: D3 は かっこぃぃ です!
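To see which character each \u escape produces, one option is to look up every character with the standard unicodedata module. The following is a sketch rather than part of unicode1.py. The helper name describe is hypothetical, and the hiragana string is copied from the listing.

```python
import unicodedata

# Same string as in unicode1.py
hiragana = u'D3 \u306F \u304B\u3063\u3053\u3043\u3043 \u3067\u3059!'

def describe(text):
    """Return (character, code point, Unicode name) for each non-ASCII character."""
    rows = []
    for ch in text:
        if ord(ch) > 127:  # skip plain ASCII such as 'D', '3', ' ' and '!'
            rows.append((ch, 'U+%04X' % ord(ch), unicodedata.name(ch, 'UNKNOWN')))
    return rows

for ch, code, name in describe(hiragana):
    print(ch, code, name)
```

Each printed row pairs a character with its code point and official Unicode name, which makes it easy to confirm that an escape sequence refers to the character you intended.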
The next portion of this chapter shows you how to “slice and dice” text strings with built-in Python functions.