**1.1 BASIC CONCEPTS OF DATA REPRESENTATION**

The study of any aspect of computer science involves the processing of information. Data are raw facts, whereas information is processed data. A data value is a piece of data that we can consider as a single entity. For example, we might consider the integer 123 as a single value. If a data value can be decomposed into component parts, we call each part a component element. An atomic data value is a piece of data that we choose to consider as a single, non-decomposable entity. For example, the integer 45923 may be considered as a single, non-decomposable entity, although if we wish to decompose it into the digits 4, 5, 9, 2, and 3, we may do so.

A natural level at which to stop the decomposition of data values stored in a digital storage medium is the bit. Logically, we may think of a bit as a data element that must have, at any time, one of two values, to which we assign the numeric values 0 and 1. Of course, we could decompose a bit further if we wished. If the value is stored on a magnetic disc, for example, it is represented by an electromagnetic signal recorded on or in the disc surface. Taking the abstract point of view, we will ignore how the values are physically stored. We might think of this point as one boundary between hardware and software.

In computers, the most widely used method for storing integers is the binary number system, whose base is 2. Each bit position represents a power of 2, with 2^{0} at the LSB (least significant bit), 2^{1} next to the LSB, and so on. For example, 10010 represents the integer 2^{0} × 0 + 2^{1} × 1 + 2^{2} × 0 + 2^{3} × 0 + 2^{4} × 1 = 18. In this representation, a string of n bits represents the integers from 0 to 2^{n} – 1. Negative binary numbers are usually stored in two's-complement form. Given n bits, the range of numbers that can be represented in this form is –2^{(n–1)} to 2^{(n–1)} – 1.
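The positional rule and the two's-complement range described above can be checked with a short Python sketch. The function names (`bits_to_unsigned`, `to_twos_complement`, `from_twos_complement`) are hypothetical helpers introduced only for illustration; they are not part of any library.

```python
def bits_to_unsigned(bits: str) -> int:
    """Sum bit * 2**position, with position 0 at the rightmost (least significant) bit."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total


def to_twos_complement(value: int, n: int) -> str:
    """Encode value as an n-bit two's-complement bit string."""
    low, high = -(2 ** (n - 1)), 2 ** (n - 1) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {n} bits")
    # Reducing modulo 2**n maps a negative value v to 2**n + v.
    return format(value % (2 ** n), f"0{n}b")


def from_twos_complement(bits: str) -> int:
    """Decode an n-bit two's-complement bit string."""
    unsigned = bits_to_unsigned(bits)
    # A leading 1 marks a negative number: subtract 2**n.
    return unsigned - 2 ** len(bits) if bits[0] == "1" else unsigned


print(bits_to_unsigned("10010"))         # 18
print(to_twos_complement(-18, 8))        # 11101110
print(from_twos_complement("11101110"))  # -18
```

With n = 8, the representable range is –128 to 127, so `to_twos_complement(128, 8)` raises `ValueError`.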

Real numbers are stored in computers in floating-point notation. In this representation, a real number is expressed in two parts, a mantissa and an exponent. The base of the exponent is usually fixed, and the mantissa and exponent vary to represent different real numbers. For example, the decimal number 125.55 could be represented as 12555 × 10^{–2}. The mantissa is 12555 and the exponent is –2. The advantage of this representation is that it can be used to represent numbers with extremely large or extremely small absolute values. Usually, in a 32-bit word, 24 bits are reserved for the mantissa and 8 bits for the exponent. The exact sizes of the mantissa and exponent depend on the machine configuration.
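As a rough illustration of the mantissa and exponent split, the following Python sketch extracts both parts of 125.55, first in base 10 (matching the example in the text) and then in base 2, which is the base most hardware uses. The helper name `decimal_parts` is hypothetical; `Decimal.as_tuple`, `Decimal.scaleb`, `math.frexp`, and `math.ldexp` are standard-library calls. The sketch assumes a finite number and does not model any particular 32-bit layout.

```python
import math
from decimal import Decimal


def decimal_parts(text: str) -> tuple[int, int]:
    """Return (mantissa, exponent) such that value = mantissa * 10**exponent.

    Assumes text is a finite decimal literal such as "125.55".
    """
    sign, digits, exponent = Decimal(text).as_tuple()
    mantissa = int("".join(str(d) for d in digits))
    return (-mantissa if sign else mantissa), exponent


mantissa, exponent = decimal_parts("125.55")
print(mantissa, exponent)                  # 12555 -2
print(Decimal(mantissa).scaleb(exponent))  # 125.55

# Base-2 view: frexp returns m and e with x == m * 2**e and 0.5 <= |m| < 1.
m, e = math.frexp(125.55)
print(e)                 # 7
print(math.ldexp(m, e))  # 125.55
```

Changing only the exponent moves the point: the same mantissa 12555 with exponent 3 denotes 12555000, which is how a fixed number of bits can cover a very wide range of magnitudes.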

Data is not always interpreted numerically; it is often stored in a non-numeric form. The number of bits necessary to represent a character in a particular computer is called the byte size, and a group of that many bits is called a byte. For character representation, two codes are normally used: the American Standard Code for Information Interchange (ASCII) and the Extended Binary Coded Decimal Interchange Code (EBCDIC). Both use one byte to represent a character, so with an 8-bit byte up to 256 distinct characters can be represented. For example, in ASCII the capital letter ‘A’ is represented by the decimal number 65.
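The character codes mentioned above can be inspected directly in Python. This is a minimal sketch: `char_code` is a hypothetical helper, and `cp037` is just one EBCDIC code page that happens to ship with Python, chosen here as an assumption for illustration; other EBCDIC variants assign some characters differently.

```python
def char_code(ch: str, encoding: str) -> tuple[int, str]:
    """Return the one-byte code of ch in the given encoding, as (decimal, 8-bit string)."""
    encoded = ch.encode(encoding)
    if len(encoded) != 1:
        raise ValueError(f"{ch!r} is not a single byte in {encoding}")
    return encoded[0], format(encoded[0], "08b")


print(char_code("A", "ascii"))  # (65, '01000001')
print(char_code("A", "cp037"))  # (193, '11000001')
```

The same letter has different codes in the two schemes, so a byte is only meaningful once the code used to interpret it is known.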

In computers, the internal representation of an integer, a real number, or a character is a string of bits. For example, the bit string 01100110 can be interpreted as the number 66 (in binary coded decimal, where each group of four bits encodes one decimal digit), which in turn is the ASCII code of the character ‘B’. A method of interpreting a bit pattern is often called a data type. We use several data types, such as binary, real, and so on, in the context of their representation in the computer. In the next section, we will describe the basic concept of data types related to abstraction of data.
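To make the idea of a data type as "a method of interpreting a bit pattern" concrete, the sketch below reads the same eight bits in two ways. The helper names `as_unsigned` and `as_bcd` are hypothetical and introduced only for this illustration.

```python
def as_unsigned(bits: str) -> int:
    """Interpret the bits as a plain binary (base-2) integer."""
    return int(bits, 2)


def as_bcd(bits: str) -> int:
    """Interpret the bits as binary coded decimal: one decimal digit per 4 bits."""
    if len(bits) % 4 != 0:
        raise ValueError("BCD needs a multiple of 4 bits")
    digits = [int(bits[i:i + 4], 2) for i in range(0, len(bits), 4)]
    if any(d > 9 for d in digits):
        raise ValueError("a 4-bit group above 1001 is not a valid BCD digit")
    return int("".join(str(d) for d in digits))


pattern = "01100110"
print(as_bcd(pattern))            # 66
print(chr(as_bcd(pattern)))       # B
print(as_unsigned(pattern))       # 102
print(chr(as_unsigned(pattern)))  # f
```

The pattern itself never changes; only the interpretation does, which is why the same byte can stand for 66, 102, ‘B’, or ‘f’.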