Natural Language Processing with Python

This book is a practical introduction to NLP. You will learn by example, write real programs, and grasp the value of being able to test an idea through implementation.

If you haven’t already learned to program, this book will teach you.

Unlike other programming books, we provide extensive illustrations and exercises from NLP.

The approach we have taken is also principled, in that we cover the theoretical underpinnings and don’t shy away from careful linguistic and computational analysis.

We have tried to be pragmatic in striking a balance between theory and application, identifying the connections and the tensions.

Finally, we recognize that you won’t get through this unless it is also pleasurable, so we have tried to include many applications and examples that are interesting and entertaining, and sometimes whimsical.

Note that this book is not a reference work.

Its coverage of Python and NLP is selective, and presented in a tutorial style.

For reference material, please consult the substantial quantity of searchable resources available at http://python.org/ and http://www.nltk.org/.

This book is not an advanced computer science text.

The content ranges from introductory to intermediate, and is directed at readers who want to learn how to analyze text using Python and the Natural Language Toolkit.

To learn about advanced algorithms implemented in NLTK, you can examine the Python code linked from http://www.nltk.org/, and consult the other materials cited in this book.

What You Will Learn

By digging into the material presented here, you will learn:

• How simple programs can help you manipulate and analyze language data, and how to write these programs

• How key concepts from NLP and linguistics are used to describe and analyze language

• How data structures and algorithms are used in NLP

• How language data is stored in standard formats, and how data can be used to evaluate the performance of NLP techniques
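
To give a flavor of the "simple programs" mentioned in the first point above, here is a minimal sketch that counts word frequencies in a short text. It is illustrative only: it uses just the Python standard library rather than NLTK, and the function name `word_frequencies`, the sample sentence, and the crude regular-expression tokenizer are all assumptions made for this example, not something taken from the book.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word occurs in text (hypothetical helper)."""
    # Lowercase the text, then treat runs of letters and apostrophes as words.
    # This is a deliberately crude tokenizer, assumed here for illustration.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


sample = "The cat sat on the mat. The dog sat too."
freqs = word_frequencies(sample)

# Show the two most frequent words with their counts.
print(freqs.most_common(2))  # prints [('the', 3), ('sat', 2)]
```

Even a program this small raises questions of the kind the later points touch on, such as what counts as a word and which data structure should hold the counts.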