

String Basics

Before we look at any code, let’s begin with a general overview of Python’s string model. To understand why Python 3.0 changed its string model the way it did, we have to start with a brief look at how characters are actually represented in computers.

Character Encoding Schemes

Most programmers think of strings as series of characters used to represent textual data. The way characters are stored in a computer’s memory can vary, though, depending on what sort of character set must be recorded.

The ASCII standard was created in the U.S., and it defines many U.S. programmers’ notion of text strings. ASCII defines character codes from 0 through 127 and allows each character to be stored in one 8-bit byte (only 7 bits of which are actually used). For example, the ASCII standard maps the character 'a' to the integer value 97 (0x61 in hex), which is stored in a single byte in memory and files. If you wish to see how this works, Python’s ord built-in function returns the integer code value for a character, and chr returns the character for a given integer code value.
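The round trip between a character and its code can be sketched as follows. This is a minimal illustration rather than the book's own listing; it assumes Python 3, and the variable names `code` and `encoded` are chosen here only for the example.

```python
# Convert a character to its integer code and back again.
code = ord('a')      # integer code for the character 'a'
print(code)          # 97
print(hex(code))     # 0x61
print(chr(code))     # a

# ASCII codes run from 0 through 127, so each fits in 7 bits
# and can be stored in a single 8-bit byte.
print(code < 128)    # True

# Encoding the one-character string as ASCII yields exactly one byte.
encoded = 'a'.encode('ascii')
print(encoded)       # b'a'
print(len(encoded))  # 1
```

Here ord and chr are inverses over the valid code range, so chr(ord(c)) gives back c for any single character c.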


  
