Chapter One
DATA STORAGE
Chapter Summary
This chapter presents the rudiments of data storage within digital computers. It introduces the
basics of digital circuitry and how a simple flip-flop can be used to store a single bit. It then
discusses addressable memory cells and mass storage systems (magnetic disks, compact disks, and
flash memory). Having established this background, the chapter discusses how information (text,
numeric values, images, and sound) is encoded as bit patterns. The optional sections delve more
deeply into these topics by presenting the problems of overflow errors, truncation errors, error
detection and correction techniques, and data compression.
Comments
1. Perhaps the most important comment I can make about this chapter (and the next one as well) is
to explain its role in the chapters that follow. This involves the distinction between exposing
students to a subject and requiring them to master the material—a distinction that is at the heart of
the spirit in which the entire text was written. The intention of this chapter is to provide a realistic
exposure to a very important area of computer science. It is not necessary for students to master
the material. All that the rest of the book requires from this chapter is what remains from a brief
exposure to the issues of data storage. Even if the course you teach requires
a mastery of these details or the development of manipulation skills, I encourage you to avoid
emphasizing bit manipulations and representation conversions. In particular, I urge you to avoid
becoming bogged down in the details of converting between base ten and binary notation. I can’t
think of anything that would be more boring for the students. (I apologize for stating my opinion.)
2. The “required” sections in this chapter cover the composition of main memory (as a background
for machine architecture in chapter 2 and data structures in chapter 8), the physical issues of
external data storage systems (in preparation for the subjects of file and database systems in chapter
9), and the rudiments of data encoding (that serves as a background for the subject of data types and
high-level language declaration statements in chapter 6). The optional sections explore the issues of
error handling, including transmission error detection and correction as well as the problem of
truncation and overflow errors resulting from numeric coding systems.
3. As mentioned in the preface of the text, there are several themes that run throughout the text, one
of which is the role of abstraction. I like to include this theme in my lecture in which I introduce flip-
flops. I end up with both flip-flop diagrams from the text on the board, and I emphasize that they
represent two ways of accomplishing the same task. I then draw a rectangle around each diagram
and erase the circuits within the rectangles leaving only the inputs, outputs, and rectangles
showing. At this point the two look identical. I think that this creates a strong visual image that
drives home the distinction between an abstract tool’s interface with the outside world and the
internal details of the tool.
This is a specific example of teaching several topics at the same time—in this case, the concepts
of abstraction and encapsulation are taught in the context of teaching digital circuits.
4. Don’t forget about the circuits in Appendix B. I used to have students who continued to record an
extra bit in the answer to a two’s complement addition problem when a carry occurred—even
though I had explained that all values in a two’s complement system were represented with the
same number of bits. Once I started presenting the addition circuit in Appendix B, this problem
disappeared. It gave the students a concrete understanding that the carry is thrown away. (Of
course, students will learn in a later course that it really isn't thrown away but is saved as
the "carry bit" for potential future use, but for now I ignore this.) I have also found that a good
exercise is to ask students to extend the circuit in Figure B.3 so that it produces an additional output
that indicates whether an overflow has occurred. For example, the output could be 1 in the case of
an overflow and 0 otherwise.
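The behavior of that circuit, including the discarded carry and the suggested overflow output, can be sketched in a few lines of Python. This is a minimal sketch, not the circuit itself; the five-bit word size matches the worked examples in the answer key, and the function name is mine:

```python
def twos_complement_add(a, b, bits=5):
    """Add two bit patterns; discard the carry out of the leftmost
    column; flag overflow when two operands of the same sign
    produce a result of the opposite sign."""
    mask = (1 << bits) - 1
    result = (a + b) & mask          # the carry, if any, is thrown away
    sign = 1 << (bits - 1)
    overflow = (a & sign) == (b & sign) and (a & sign) != (result & sign)
    return result, overflow

# 01100 + 00100 = 10000 with overflow (12 + 4 in five bits)
result, overflow = twos_complement_add(0b01100, 0b00100)
print(format(result, "05b"), overflow)  # 10000 True
```

The sign-comparison test is exactly the condition the extended circuit in the exercise would detect in hardware.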
5. For most students, seeing the reality of the things they are told is a meaningful experience. For
this reason I often find it advantageous to demonstrate the distinction between numeric and
character data using a spreadsheet. I like to show them how the manipulation of large numbers can
lead to errors.
6. I have found that students respond well to hearing about CD and DVD storage systems, how
sound is encoded, and image representation systems such as GIF and JPEG. I have often used these
topics as a way of getting non-majors interested in technical issues.
7. For students not majoring in computer science, topics such as two's complement and floating-point
notation can get a bit dry. The main point for them to understand is that when information is
encoded, some information usually gets lost. This point can be made just as well using audio and
video, which are contexts that seem more interesting to the non-majors.
Answers to Chapter Review Problems
1. a. 0 b. 0 c. 0
2. a. upper input = 1, lower input = 0
b. upper input = either 0 or 1, lower input = 1
c. upper input = 1, lower input = 0
3. a. AND b. OR c. XOR
4. This is a flip-flop that is triggered by 0s rather than 1s. That is, temporarily changing the upper
input to 0 will establish an output of 1, whereas temporarily changing the lower input to 0 will
establish an output of 0. To obtain an equivalent circuit using NAND gates, simply replace each
AND-NOT gate pair with a NAND gate.
5. Address Contents
00 02
01 53
02 01
03 53
6. 256 using two hexadecimal digits (8 bits), 65,536 using four hexadecimal digits (16 bits).
7. a. 11001011 b. 01100111 c. 10101001 d. 00010000 e. 11111111
8. a. 0 b. 1 c. 1 d. 0
9. a. AAA b. CB7 c. 0EB
10. The image consists of 1024 x 1024 = 1,048,576 pixels and therefore 3 x 1,048,576 = 3,145,728 bytes,
or about 3MB. This means that about 85 images could be stored in the 256MB camera storage
system. (By comparing this to actual camera storage capacities, students can gain an appreciation
for the benefits of image compression techniques. Using them, a typical 256MB storage system can
hold as many as 300 images.)
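The arithmetic behind this answer can be checked with a quick sketch (taking MB as 2^20 bytes; integer division gives the number of complete images):

```python
pixels = 1024 * 1024                  # 1,048,576 pixels
bytes_per_image = 3 * pixels          # 3,145,728 bytes, about 3MB
capacity = 256 * 2**20                # a 256MB storage system
print(bytes_per_image, capacity // bytes_per_image)  # 3145728 85
```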
11. 786,432. (Each pixel would require one memory cell.)
12. Data retrieval from main memory is much faster than from disk storage. Also data in main
memory can be referenced in byte-sized units rather than in large blocks. On the other hand, disk
storage systems have a larger capacity than main memory and the data stored on disk is less volatile
than that stored in main memory.
13. There are 70GB of material on the hard-disk drive. Each CD can hold no more than 700MB. Thus,
it will require at least 100 CDs to store all the material. That does not seem practical to me. On the
other hand, DVDs have capacities of about 4.7GB, meaning that only about 15 DVDs would be
required. This may still be impractical, but it's a big improvement over CDs. (The real point of this
problem is to get students to think about storage capacities in a meaningful way.)
14. There would be about 5,000 characters on the page requiring two bytes each (Unicode). So the
page would require about 10,000 bytes or 10 sectors of size 1024 bytes.
15. The novel would require about 1.4MB using ASCII and about 2.8MB if Unicode were used.
16. The latency time of a disk spinning at 60 revolutions per second is only 0.0083 seconds (on
average, half a revolution, or 1/120 second).
17. About 18.3 milliseconds.
18. About 7 years!
19. What does it say?
20. hexadecimal
21. a.
1 0 0 / 5
00110001 00110000 00110000 00101111 00110101
= 2 0
00100000 00111101 00100000 00110010 00110000
b.
T o b e
01010100 01101111 00100000 01100010 01100101
o r n
00100000 01101111 01110010 00100000 01101110
o t t o
01101111 01110100 00100000 01110100 01101111
b e ?
00100000 01100010 01100101 00111111
c.
T h e t
01010100 01101000 01100101 00100000 01110100
o t a l
01101111 01110100 01100001 01101100 00100000
c o s t
01100011 01101111 01110011 01110100 00100000
i s $ 7
01101001 01110011 00100000 00100100 00110111
. 2 5 .
00101110 00110010 00110101 00101110
22. a. 31 30 30 2F 35 20 3D 20 32 30
b. 54 6F 20 62 65 20 6F 72 20 6E 6F 74 20 74 6F 20 62 65 3F
c. 54 68 65 20 74 6F 74 61 6C 20 63 6F 73 74 20 69 73 20 24
37 2E 32 35 2E
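Answers 21 and 22 can be generated (or checked) mechanically. A short Python sketch, assuming the 8-bit ASCII codes used in the text; the function names are mine:

```python
def ascii_bits(text):
    """Each character as an 8-bit ASCII pattern (as in answer 21)."""
    return " ".join(format(ord(ch), "08b") for ch in text)

def ascii_hex(text):
    """The same codes written in hexadecimal notation (as in answer 22)."""
    return " ".join(format(ord(ch), "02X") for ch in text)

print(ascii_hex("100/5 = 20"))  # 31 30 30 2F 35 20 3D 20 32 30
print(ascii_bits("To"))         # 01010100 01101111
```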
23. 110, 111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111, 10000
24. a. 00110010 00110110 b. 11010
25. They are the powers of two. 1 10 100 1000 10000 100000
26. a. 7 b. 1 c. 21 d. 17 e. 19 f. 0 g. 4 h. 8 i. 16 j. 25 k. 26 l. 27
27. a. 111 b. 1011 c. 10000 d. 1111 e. 100001
28. a. 0 b. 3 c. -3 d. -1 e. 7
29. a. 100 b. 111 c. 001 d. 011 e. 101
30. a. 15 b. -13 c. 13 d. -16 e. -9
31. a. 0001100 b. 1110100 c. 1111111 d. 0000000 e. 0001000
32. a. 01101 b. 00000 c. 10000 (incorrect) d. 10001 e. 01110
f. 10011 (incorrect) g. 11110 h. 01101 i. 10000 (incorrect) j. 11111
33. a.
7 00111
+1 becomes + 00001
01000 which represents 8
b.
7 00111 00111
- 1 becomes - 00001 which converts to + 11111
00110 which represents 6
c.
12 01100 01100
- 4 becomes - 00100 which converts to + 11100
01000 which represents 8
d.
8 01000 01000
- 7 becomes - 00111 which converts to + 11001
00001 which represents 1
e.
12 01100
+ 4 becomes + 00100
10000 which represents -16 (overflow)
f.
5 00101 00101
- 11 becomes - 01011 which converts to + 10101
11010 which represents -6
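The "converts to" step in parts b, c, d, and f is ordinary two's complement negation: complement every bit, then add one. A minimal sketch in Python (five-bit words; the function name is mine):

```python
def negate(pattern, bits=5):
    """Complement every bit, then add one, staying within the word size."""
    mask = (1 << bits) - 1
    return ((~pattern & mask) + 1) & mask

print(format(negate(0b00001), "05b"))  # 11111 (so - 00001 becomes + 11111)
print(format(negate(0b00100), "05b"))  # 11100 (so - 00100 becomes + 11100)
```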
34. a. 3 1/4 b. 4 5/16 c. 13/16 d. 1 e. 2 1/8
35. a. 101.01 b. .0001 c. 111.111 d. 1.11 e. 110.101
36. a. 1 1/4 b. -1/2 c. 3/16 d. -9/32
37. a. 01001000 b. 01111111 c. 11101111
d. 00101010 e. 00011111 (truncation)
38. 00111100, 01000110, and 01010011
39. The best approximation of the square root of 2 is 1 3/8 represented as 01011011. The square of
this value when represented in floating-point format is 01011111, which is the representation of 1
7/8.
40. The value one-eighth, which would be represented as 00101000.
41. Since the value one-tenth cannot be represented accurately, such recordings would suffer from
truncation errors.
42. From left to right the result would be 2 3/4. From right to left the result would be 2 1/2.
43. a. 1 5/8 b. 4 c. 3 1/4
44. a. 01101100 b. 01101000 c. 01111000 (truncation) d. 01101011
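Answers 34 through 44 use the text's 8-bit floating-point format: a sign bit, a three-bit excess-four exponent, and a four-bit mantissa with the radix point at its left (a reading consistent with answers 39 and 40). A sketch of a decoder, assuming that format:

```python
def decode_float8(pattern):
    """Decode the 8-bit format: sign bit, 3-bit excess-4 exponent,
    4-bit mantissa interpreted as .bbbb."""
    sign = -1 if pattern & 0b10000000 else 1
    exponent = ((pattern >> 4) & 0b111) - 4   # excess-4 notation
    mantissa = (pattern & 0b1111) / 16        # radix point at the left
    return sign * mantissa * 2**exponent

print(decode_float8(0b01011011))  # 1.375, i.e. 1 3/8 (answer 39)
print(decode_float8(0b00101000))  # 0.125, i.e. 1/8 (answer 40)
```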
45. a. The value is either eleven or negative five.
b. A value represented in two's complement notation can be changed to excess notation by
changing the high-order bit, and vice versa.
46. The value is two; the patterns are floating-point, excess, and two's complement, respectively.
47. a. This is the value -5 coded in floating-point, excess 8, and two's complement notation,
respectively.
b. This is the value -3 coded in two's complement, excess 128, and floating-point notation,
respectively.
c. This is the value 2 coded in excess 8, two's complement (or binary), and floating-point
notation, respectively.
48. Only bit patterns of length 5 are valid excess 16 representations. Thus, 101, 010101, 1000, 000000,
and 1111 are not valid.
49. b would require too large an exponent. c would require too many significant digits. d would
require too many significant digits.
50. When using binary notation, the largest value that could be represented would change from 15
to 255. When using two's complement notation the largest value that could be represented would
change from 7 to 127.
51. 4FFFFF
52. Use the first and second inputs as inputs to an XOR gate. Do the same with the third and fourth
inputs. Then, tie the outputs of these two XOR gates to the inputs of a third XOR gate.
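The cascade can be simulated to confirm what it computes: the output is 1 exactly when an odd number of the four inputs are 1. A Python sketch, with ^ standing in for an XOR gate:

```python
def circuit(i1, i2, i3, i4):
    """Two XOR gates feed a third, wired as described in answer 52."""
    return (i1 ^ i2) ^ (i3 ^ i4)

# Exhaustive check: the output matches the parity of the inputs.
for bits in range(16):
    inputs = [(bits >> k) & 1 for k in range(4)]
    assert circuit(*inputs) == sum(inputs) % 2
print("output is 1 exactly when an odd number of inputs are 1")
```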
53. 1123221343435
54. yyxy xx yyxy xyx xx xyx
55. Starting with the first entries, they would be x, y, space, xxy, yyx, and xxyy.
56. Not a chance. MPEG requires transfer rates of 40 Mbps.
57. a.
1 0 0 / 5
00110001 10110000 10110000 00101111 10110101
= 2 0
00100000 00111101 00100000 00110010 10110000
b. T o b e
01010100 11101111 00100000 01100010 11100101
o r n
00100000 11101111 11110010 00100000 01101110
o t t o
11101111 11110100 00100000 11110100 11101111
b e ?
00100000 01100010 11100101 10111111
c. T h e t
01010100 01101000 11100101 00100000 11110100
o t a l
11101111 11110100 01100001 11101100 00100000
c o s t
11100011 11101111 01110011 11110100 00100000
i s $ 7
11101001 01110011 00100000 10100100 00110111
. 2 5 .
10101110 00110010 10110101 10101110
58. Assuming the odd-parity convention of answer 57, the strings 11011, 00000, and 10001 definitely
contain errors, since each has an even number of 1s:
11001 11011 10110 00000 11111 10001 10101 00100 01110
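With the odd-parity convention used in answer 57, any pattern whose count of 1s is even must contain an error. A quick Python filter over the nine strings:

```python
def has_error(pattern):
    """Under odd parity, a valid pattern has an odd number of 1s."""
    return pattern.count("1") % 2 == 0

strings = ["11001", "11011", "10110", "00000", "11111",
           "10001", "10101", "00100", "01110"]
print([s for s in strings if has_error(s)])  # ['11011', '00000', '10001']
```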
59. The code would have a Hamming distance of 3. Thus, by using it, one could detect up to 2 errors
per character and correct up to 1 error per character.
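The relationship used in answer 59 (a distance of d lets a code detect d-1 errors and correct (d-1)//2 per word) rests on a pairwise computation that is easy to sketch in Python; the codeword set in the usage line is mine, for illustration only:

```python
def hamming_distance(a, b):
    """Number of positions in which two equal-length patterns differ."""
    return sum(x != y for x, y in zip(a, b))

def code_distance(codewords):
    """Minimum Hamming distance over all pairs of codewords; a code of
    distance d detects d-1 errors and corrects (d-1)//2 per word."""
    return min(hamming_distance(a, b)
               for i, a in enumerate(codewords)
               for b in codewords[i + 1:])

print(hamming_distance("10101", "10010"))          # 3
print(code_distance(["000", "011", "101", "110"]))  # 2
```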
60. a. HE b. FED c. DEAD d. CABBAGE e. CAFE