Up Next

This HTML version of Think Data Structures is provided for convenience, but it is not the best format of the book. In particular, some of the symbols are not rendered correctly.

You might prefer to read the PDF version.

Or you can buy this book on Amazon.com.

Preface

The philosophy behind the book

Data structures and algorithms are among the most important inventions of the last 50 years, and they are fundamental tools software engineers need to know. But in my opinion, most of the books on these topics are too theoretical, too big, and too “bottom up”:

Too theoretical
Mathematical analysis of algorithms is based on simplifying assumptions that limit its usefulness in practice. Many presentations of this topic gloss over the simplifications and focus on the math. In this book I present the most practical subset of this material and omit or de-emphasize the rest.
Too big
Most books on these topics are at least 500 pages, and some are more than 1000. By focusing on the topics I think are most useful for software engineers, I kept this book under 200 pages.
Too “bottom up”
Many data structures books focus on how data structures work (the implementations), with less about how to use them (the interfaces). In this book, I go “top down”, starting with the interfaces. Readers learn to use the structures in the Java Collections Framework before getting into the details of how they work.

Finally, some books present this material out of context and without motivation: it’s just one damn data structure after another! I try to liven it up by organizing the topics around an application — web search — that uses data structures extensively, and is an interesting and important topic in its own right.

This application motivates some topics that are not usually covered in an introductory data structures class, including persistent data structures with Redis.

I have made difficult decisions about what to leave out, but I have made some compromises. I include a few topics that most readers will never use, but that they might be expected to know, possibly in a technical interview. For these topics, I present both the conventional wisdom as well as my reasons to be skeptical.

This book also presents basic aspects of software engineering practice, including version control and unit testing. Most chapters include an exercise that allows readers to apply what they have learned. Each exercise provides automated tests that check the solution. And for most exercises, I present my solution at the beginning of the next chapter.

0.1  Prerequisites

This book is intended for college students in computer science and related fields, as well as professional software engineers, people training in software engineering, and people preparing for technical interviews.

Before you start this book, you should know Java pretty well; in particular, you should know how to define a new class that extends an existing class or implements an interface. If your Java is rusty, here are two books you might start with:

  • Downey and Mayfield, Think Java (O’Reilly Media, 2016), which is intended for people who have never programmed before.
  • Sierra and Bates, Head First Java (O’Reilly Media, 2005), which is appropriate for people who already know another programming language.

If you are not familiar with interfaces in Java, you might want to work through the tutorial called “What Is an Interface?” at http://thinkdast.com/interface.

One vocabulary note: the word “interface” can be confusing. In the context of an application programming interface (API), it refers to a set of classes and methods that provide certain capabilities.

In the context of Java, it also refers to a language feature, similar to a class, that specifies a set of methods. To help avoid confusion, I’ll use “interface” in the normal typeface for the general idea of an interface, and interface in the code typeface for the Java language feature.

You should also be familiar with type parameters and generic types. For example, you should know how create an object with a type parameter, like ArrayList<Integer>. If not, you can read about type parameters at http://thinkdast.com/types.

You should be familiar with the Java Collections Framework (JCF), which you can read about at http://thinkdast.com/collections. In particular, you should know about the List interface and the classes ArrayList and LinkedList.

Ideally you should be familiar with Apache Ant, which is an automated build tool for Java. You can read more about Ant at http://thinkdast.com/anttut.

And you should be familiar with JUnit, which is a unit testing framework for Java. You can read more about it at http://thinkdast.com/junit.

Working with the code

The code for this book is in a Git repository at http://thinkdast.com/repo.

Git is a “version control system” that allows you to keep track of the files that make up a project. A collection of files under Git’s control is called a “repository”.

GitHub is a hosting service that provides storage for Git repositories and a convenient web interface. It provides several ways to work with the code:

  • You can create a copy of the repository on GitHub by pressing the Fork button. If you don’t already have a GitHub account, you’ll need to create one. After forking, you’ll have your own repository on GitHub that you can use to keep track of code you write. Then you can “clone” the repository, which downloads a copy of the files to your computer.

  • Alternatively, you could clone the repository without forking. If you choose this option, you don’t need a GitHub account, but you won’t be able to save your changes on GitHub.
  • If you don’t want to use Git at all, you can download the code in a ZIP archive using the Download button on the GitHub page, or this link: http://thinkdast.com/zip.

After you clone the repository or unzip the ZIP file, you should have a directory called ThinkDataStructures with a subdirectory called code.

The examples in this book were developed and tested using Java SE Development Kit 7. If you are using an older version, some examples will not work. If you are using a more recent version, they should all work.

Contributors

This book is an adapted version of a curriculum I wrote for the Flatiron School in New York City, which offers a variety of online classes related to programming and web development. They offer a class based on this material, which provides an online development environment, help from instructors and other students, and a certificate of completion. You can find more information at http://flatironschool.com.

  • At the Flatiron School, Joe Burgess, Ann John, and Charles Pletcher provided guidance, suggestions, and corrections from the initial specification all the way through implementation and testing. Thank you all!
  • I am very grateful to my technical reviewers, Barry Whitman, Patrick White, and Chris Mayfield, who made many helpful suggestions and caught many errors. Of course, any remaining errors are my fault, not theirs!
  • Thanks to the instructors and students in Data Structures and Algorithms at Olin College, who read this book and provided useful feedback.

If you have comments or ideas about the text, please send them to: feedback@greenteapress.com.

Are you using one of our books in a class?

We'd like to know about it. Please consider filling out this short survey.


Think Data Structures

Think DSP

Think Java

Think Bayes

Think Python 2e

Think Stats 2e

Think Complexity


Up Next