Discrete Mathematics: A Foundation for Computer Science

by David Mix Barrington

Preface for the Instructor

Overview and Motivation

Most computer science undergraduate programs have a required course in ``Discrete Mathematics''. Such a course generally has two, sometimes competing, goals: to develop the mathematical maturity required to understand and construct the formal arguments in more advanced computer science, and to cover the specific material needed in later courses (particularly theory courses), usually some combination of logic, set theory, number theory, combinatorics, probability, and graph theory.

Most instructors of my acquaintance consider this to be a ``problem course'' in the curriculum. The students tend not to like it because they would rather be learning programming than mathematical thinking --- it is unmotivated with respect to the rest of computer science. This is often exacerbated by the cultural gulf between an instructor who is a mathematician and students who want to be computer scientists, especially if it is taught as a service course by a mathematics department. A further problem is that the usual organization of the course is disjointed --- with so many different topics (``If it's Tuesday, this must be graph theory...'') it has no single theme.

There is an obvious way to solve both of these problems: choose an unarguably computer-science topic that involves so many of the standard discrete mathematics subject areas that a discrete mathematics course can be built around it. If this works, the computer science students will be more involved and the course will have a unifying theme. This is not a new idea, and other books have tried it with various computer science themes, but this book represents a substantially different approach that may be particularly suited to computer science programs.

This book offers three different ways to tell a complete story about computation that begins with the fundamentals of discrete mathematics:

Any of these course structures leads to a course that deals with computation constantly, but develops mathematical sophistication by leading the students all the way from the basic notion of proof to some significant real mathematics.

Highlights of the Approach

Prerequisites

This book is an outgrowth of the author's development of two particular courses, Computer Science 240 and 250, at the University of Massachusetts Amherst. CMPSCI 250, Introduction to Computation, has corequisites of second-semester calculus and data structures (second-semester programming) and CMPSCI 240, Reasoning About Uncertainty, has both these courses as prerequisites. The math prerequisite of the former, at least, is not justified by any calculus content in most chapters of the book beyond a few throwaway references, but serves to ensure some minimum level of skill in calculation. (Chapters 7-11 make more direct reference to calculus.) The programming prerequisite is more important, because much of the material refers specifically to constructs in a modern imperative and object-oriented programming language, including recursion. The programming examples and exercises refer specifically to Java, but they primarily deal with aspects of Java that are not terribly different in C or C++.

The primary target audience of this book is sophomore and junior computer science majors, though it should be usable in a wider range of settings. It has been used at the University of Massachusetts for a 14-week, 4-credit one-semester course, Computer Science 250. This course in Spring 2006, for example, used 53 of the 120 lecture sections in the book, including a core of 1, 2, 3.1-3.6, 4, 5.1-5.5, 6.1-6.7, and 14, with various additional sections. Computer Science 240 in Spring 2008 used nearly all of Chapters 6, 10, 11, and 12.

This book is intended to be a primary text for a course. Two supplementary texts are recommended: How to Read and Do Proofs by Solow for those needing additional familiarization with the concept of proof, and Gödel, Escher, Bach: An Eternal Golden Braid by Hofstadter, both for its philosophical examinations of computer science as a whole (and artificial intelligence in particular) and for its more detailed treatment of first-order number theory. Footnotes in this text refer to Hofstadter and Solow where appropriate, but the presentation here is intended to be self-contained.

Guide for Selection of Sections

Each of the fifteen chapters has eight ordinary sections and three Excursions. An Excursion is a suggested tangential or supplementary topic, related to the main lecture material in the sections. At UMass, the class using this book meets for four 50-minute classes per week: three regular lectures and one ``discussion section'' with the instructor. In the latter the students carry out the writing assignment from an Excursion, with the supervision and intervention of the instructor as necessary. Most Excursion writing assignments would also be suitable for in-class or take-home group projects.

A 50-minute lecture will usually cover one but sometimes two ordinary sections, depending on the level of sophistication of the students and their familiarity with the material. Usually there is more mathematical material in the book than could be covered in that time. Some sections may demand more than one lecture, depending on the instructor's taste, and some can be omitted. In general the instructor can decide to do a few, most, or all of the sections in each chapter, following the specific guidelines below.

Chapter 1 begins the treatment of the language of mathematics with the two most fundamental concepts, sets and propositions, and the relationship between them. Proofs in the propositional calculus are treated in some detail with many examples. The chapter concludes with the introduction of predicates and practice in translating between symbolic language and English. All this material is necessary, but more sophisticated students should be able to pass through it more quickly. Of the three Excursions, 1.3 is a more philosophical introduction to the course while 1.9 and 1.11 are direct practice in important skills.

Chapter 2 introduces the predicate calculus and applies it to properties of relations. Students learn what quantifiers are, what they mean, how to prove quantified statements, and how they are used to define several important concepts. All of this chapter is necessary, though Section 2.5 can be postponed if the regular language material of Chapter 5 is to be done later. The last two sections each contain a substantial proof. Section 2.10 proves that Hasse diagrams provide a general notation for finite partial orders, and 2.11 relates equivalence relations to partitions. The first proof is not directly used later, but gives a good early example of some real mathematics. The second proof should definitely be included. Of the three Excursions, 2.2 is a Java-based tangent while 2.4 and 2.7 provide important practice.

Chapter 3 deals with number theory without using induction. Many instructors with mathematical training tend to overestimate students' familiarity with ideas like prime numbers. Though the material is not heavily used later, this chapter can go a long way toward introducing students to mathematical culture. Sections 3.1-3.6 show why unique factorization into primes and the Chinese Remainder Theorem are true, from first principles. Sections 3.7-3.11 are more sophisticated (except for the basic cryptography in 3.11) but can be very well motivated for students wanting some understanding of, for example, public-key cryptography. The Excursions are rather diverse: 3.2 offers tricks for hand calculation of divisibility, 3.7 explores a ``hard problem'' from Hofstadter, and 3.10 applies the algebraic machinery developed for RSA to look at a question in computational complexity.

Chapter 4, on induction, is in many ways the core of the book. It shows how induction is part of the definition of the system of natural numbers, and how it works in any system with a similar definition. The relationships among inductive definition, inductive proof, and recursive algorithms are stressed throughout. The Excursions are fairly central to the argument. The last sections, 4.9-4.11, provide a very brief introduction to graph theory. They could substitute for Chapter 8 if that is to be entirely omitted, or be skipped if graph theory is to be covered elsewhere in the curriculum.

Chapter 5 applies induction and recursive definition to regular languages (5.1-5.5) and various other structures. The first five sections are necessary for Chapter 9, but the last six are each optional. 5.10 and 5.11 introduce the application of discrete mathematical reasoning to actual computer programs --- for example, 5.10 proves that any program with IFs and GOTOs can be simulated by a program of WHILEs and IF-THEN-ELSEs.

Chapter 6 covers the basic principles of combinatorics at a rather relaxed pace, so that this material can be taken slowly (one section per lecture) or quickly (two sections per lecture). One section is devoted to each of the four basic counting problems, counting sequences with or without order and with or without replacement. Excursion 6.5 revisits the sorting problem, familiar from a data structures course, and relates it to counting permutations. Excursion 6.8 gives some practical hints on generating lists of permutations and combinations with a computer. Finally, 6.9-6.11 explore the idea of combinatorial proof that is central to further work in combinatorics.
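
For reference, the answers to the four basic counting problems, in the usual notation with $n$ available items and selections of length $k$, are:
\[
\begin{array}{ll}
\mbox{ordered, with replacement:} & n^k \\
\mbox{ordered, without replacement:} & n(n-1)\cdots(n-k+1) = n!/(n-k)! \\
\mbox{unordered, without replacement:} & \binom{n}{k} = n!/(k!\,(n-k)!) \\
\mbox{unordered, with replacement:} & \binom{n+k-1}{k}
\end{array}
\]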

Chapter 7 explores three further topics in combinatorics: the analysis of recurrences (7.1-7.4), asymptotic notation (7.5-7.7), and infinite sets (7.8-7.11). The first two are important preparation for a future course in the analysis of algorithms, though different curricula will place different demands on the discrete mathematics course in these areas. The last topic is of little direct importance but introduces the diagonal argument that recurs in computability theory (for example, in our Chapter 15). Each of these three topics has an Excursion.

Chapter 8, on graph theory, can be used flexibly. Sections 8.1-8.3 cover graphs, their representation by adjacency matrices, and the theorem relating paths in a graph to matrix multiplication. These sections are very useful, though not absolutely required, for Chapter 14. Sections 8.4 and 8.5 look at further implications of the matrix/path theorem. Finally, Sections 8.6-8.11 cover the classical topics of graph theory (such as coloring and planarity), and an instructor may pick and choose freely among them.
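
In its standard formulation, the matrix/path theorem says that if $A$ is the adjacency matrix of a graph, then for every $k \geq 1$ the $(i,j)$ entry of $A^k$ counts the paths (walks) of length $k$ from vertex $i$ to vertex $j$:
\[
(A^k)_{ij} = \mbox{the number of paths of length } k \mbox{ from } i \mbox{ to } j.
\]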

Chapter 9 deals with trees primarily as a model for search algorithms, after a review of the definitions and induction techniques (9.1-9.3). Breadth-first and depth-first search are treated in two ways: as general search algorithms, as in AI (9.5), and in the special case where visited states can be marked (9.6), as in an algorithms course. The chapter concludes with two search algorithms common in AI (9.8, 9.9) and a brief look at adversary search (9.10, 9.11).

Chapter 10 is an introduction to discrete probability and lays the groundwork for Chapters 11, 12, and 13. A minimal treatment of the area could be obtained with just the sections on definitions and expected value (10.1-10.4), concluding with an Excursion computing the winning probability in Craps, a dice game. The middle of the chapter (10.5-10.7) covers variance and standard deviation, explaining how results from continuous probability may be used to solve discrete problems. Finally, the remaining sections (10.8-10.11) present some more advanced methods to bound the probability of particular events based on analysis of a discrete distribution.

Chapter 11 is a more advanced treatment of reasoning techniques seen in AI. It begins with a thorough treatment of Bayes' Theorem (11.1-11.4) and its use to update estimates of probability based on new evidence. It then considers the Naive Bayes Classifier, a simple learning algorithm used, for example, in spam filters (11.5-11.8). Finally, simulation is presented as an alternative method of estimating probabilities (11.9-11.11).
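
For reference, the update rule at the heart of Sections 11.1-11.4 is Bayes' Theorem in its standard form: for a hypothesis $H$ and observed evidence $E$ with $P(E) > 0$,
\[
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)},
\]
so a prior estimate $P(H)$ is revised to the posterior $P(H \mid E)$ once $E$ is observed.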

Chapter 12 deals with Markov chains, defining them and arguing that they generally reach a steady-state distribution (12.1-12.4). Markov decision processes are then considered as an example of how to act in a situation where the results depend both on the decisions and on random events (12.5-12.8). The chapter concludes with the classical theory of two-player simultaneous-move games, where optimal strategies are usually probabilistic (12.9-12.11).
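
In the usual row-vector convention, a steady-state distribution of a chain with transition matrix $P$ is a probability vector $\pi$ that is unchanged by one more step of the chain:
\[
\pi P = \pi, \qquad \sum_i \pi_i = 1, \qquad \pi_i \geq 0.
\]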

Chapter 13 considers information theory, a fundamental topic for computer science that is not usually covered in lower-level courses. The initial treatment in Sections 13.1-13.5 gives useful background for coding and data compression. The remaining sections are more advanced, introducing the concept of entropy and exploring the limits of representing information streams by coding.

Chapter 14 is a thorough treatment of finite-state machines, proving the Myhill-Nerode Theorem (14.2-14.3) and Kleene's Theorem (the equivalence of FSMs and regular languages, 14.5-14.10). I would include the entire chapter, except possibly for the somewhat tangential Excursions 14.4 and 14.11, in any course that culminates with this topic. Excursion 14.9 works through an example of most of the constructions in Kleene's Theorem and is fairly important.

Chapter 15 is a brief look at topics in computability and complexity theory beyond finite-state machines. Since a course based on this book might be the last course in the theory of computation for many students, an instructor might want to say a little about one or more of grammars and context-free languages (15.2-15.5), general computability (15.6-15.10), or complexity (15.11). This chapter is no substitute for a follow-on course, naturally, but it might serve to whet students' interest. Section 15.1, on two-way automata, presents a self-contained result that is somewhat neglected and not directly relevant to what follows, but it is fairly simple to do after the Myhill-Nerode Theorem and is a nice, accessible example of the idea of an abstract model, which is central to the rest of the chapter.

Exercises and Problems

Each ordinary section contains at least five Exercises and at least five Problems. The former are intended to be direct applications of concepts in the section, generally more suitable for student review than for graded homework. The Problems are more sophisticated, sometimes laying the groundwork for concepts to be encountered later. Some Exercises and some Problems involve the use of Java, and this is indicated at the beginning of each. Each Excursion contains one or more suggested Writing Exercises. Instructors should consider these carefully to decide how to use them.

Chapter Glossaries

Every new term is indicated in bold face when it is first introduced in the text. At the end of each chapter is a glossary giving the page number for each of these defined terms. The glossary provides a useful study guide for the student, since one main goal of the course is the acquisition of new vocabulary.

Author Information

I am by training a mathematician, dabbling in topology and set theory in graduate school at M.I.T. before settling on computational complexity theory. My thesis, under the direction of Mike Sipser, was a characterization of the computing power of constant-width branching programs --- I found a way in which polynomial-size constant-width programs could compute things previously believed to be beyond them. My subsequent research (with several collaborators) has uncovered surprising connections between complexity theory and some of the subjects of this book: finite-state machines and first-order logic.

In 1986 I joined the computer science faculty of the University of Massachusetts at Amherst, a large research university where the undergraduates range widely in ability. I've developed a strong interest in and affection for undergraduate teaching, particularly the task of motivating computer science students to think ``like mathematicians'' to the extent necessary for later work in their own field. I have come to know our students and their problems very well, as both a teacher and an academic advisor. Discrete mathematics has become my favorite course to teach; I have offered it several times at UMass and once each (to different student populations) at the University of Washington and at Mount Holyoke College.

This particular project originally arose from an overhaul of our undergraduate curriculum, in which we decided that our discrete mathematics course needed to come earlier in the sequence, be oriented toward computer science, and yet be more mathematically rigorous than its predecessor. With the help of a Lilly Teaching Fellowship, I spent the academic year of 1994-95 both developing this particular course and experimenting with innovative teaching methods.

More recently we have revisited the undergraduate curriculum and decided to split ``discrete mathematics'' into two courses, adding material on general search and probabilistic reasoning that had previously been deferred to the upper-level AI course. Chapters 9-13 are one result of my course development efforts in this regard.

Last modified 29 December 2008