Text Processing in Python

Author: David Mertz

Publication date: June 6, 2003

Pages: 544

ISBN: 0-321-11254-7

Permalink for this book: http://www.ppurl.com/2008/09/text-processing-in-python.html

Book description

Text Processing in Python is an example-driven, hands-on tutorial that carefully teaches programmers how to accomplish numerous text processing tasks using the Python language. Filled with concrete examples, this book provides efficient and effective solutions to specific text processing problems and practical strategies for dealing with all types of text processing challenges.

Text Processing in Python begins with an introduction to text processing and contains a quick Python tutorial to get you up to speed. It then delves into essential text processing subject areas, including string operations, regular expressions, parsers and state machines, and Internet tools and techniques. Appendixes cover such important topics as data compression and Unicode. A comprehensive index and plentiful cross-referencing offer easy access to available information. In addition, exercises throughout the book provide readers with further opportunity to hone their skills either on their own or in the classroom. A companion Web site (http://gnosis.cx/TPiP) contains source code and examples from the book.

Here is some of what you will find in this book:

  • When do I use formal parsers to process structured and semi-structured data? Page 257

  • How do I work with full text indexing? Page 199

  • What patterns in text can be expressed using regular expressions? Page 204

  • How do I find a URL or an email address in text? Page 228

  • How do I process a report with a concrete state machine? Page 274

  • How do I parse, create, and manipulate internet formats? Page 345

  • How do I handle lossless and lossy compression? Page 454

  • How do I find codepoints in Unicode? Page 465

Table of contents
     Copyright
     Preface
        Section 0.1.  What Is Text Processing?
        Section 0.2.  The Philosophy of Text Processing
        Section 0.3.  What You'll Need to Use This Book
        Section 0.4.  Conventions Used in This Book
        Section 0.5.  A Word on Source Code Examples
        Section 0.6.  External Resources
 
     Acknowledgments
     Chapter 1.  Python Basics
        Section 1.1.  Techniques and Patterns
        Section 1.2.  Standard Modules
        Section 1.3.  Other Modules in the Standard Library
 
     Chapter 2.  Basic String Operations
        Section 2.1.  Some Common Tasks
        Section 2.2.  Standard Modules
        Section 2.3.  Solving Problems
 
     Chapter 3.  Regular Expressions
        Section 3.1.  A Regular Expression Tutorial
        Section 3.2.  Some Common Tasks
        Section 3.3.  Standard Modules
 
     Chapter 4.  Parsers and State Machines
        Section 4.1.  An Introduction to Parsers
        Section 4.2.  An Introduction to State Machines
        Section 4.3.  Parser Libraries for Python
 
     Chapter 5.  Internet Tools and Techniques
        Section 5.1.  Working with Email and Newsgroups
        Section 5.2.  World Wide Web Applications
        Section 5.3.  Synopses of Other Internet Modules
        Section 5.4.  Understanding XML
 
     Appendix A.  A Selective and Impressionistic Short Review of Python
        Section A.1.  What Kind of Language Is Python?
        Section A.2.  Namespaces and Bindings
        Section A.3.  Datatypes
        Section A.4.  Flow Control
        Section A.5.  Functional Programming
 
     Appendix B.  A Data Compression Primer
        Section B.1.  Introduction
        Section B.2.  Lossless and Lossy Compression
        Section B.3.  A Data Set Example
        Section B.4.  Whitespace Compression
        Section B.5.  Run-Length Encoding
        Section B.6.  Huffman Encoding
        Section B.7.  Lempel-Ziv Compression
        Section B.8.  Solving the Right Problem
        Section B.9.  A Custom Text Compressor
        Section B.10.  References
 
     Appendix C.  Understanding Unicode
        Section C.1.  Some Background on Characters
        Section C.2.  What Is Unicode?
        Section C.3.  Encodings
        Section C.4.  Declarations
        Section C.5.  Finding Codepoints
        Section C.6.  Resources
 
     Appendix D.  A State Machine for Adding Markup to Text
     Appendix E.  Glossary
