Books on string algorithms

There have been numerous posts on string algorithms:

However, no general literature was mentioned.

Could anyone recommend a book(s) that would thoroughly explore various string algorithms? The topic which is of special interest is approximate string matching [things like google-offered corrected search string variants :) ].

Thanks a lot for advice.

Answers


I'm surprised no-one has mentioned Dan Gusfield's excellent book Algorithms on Strings, Trees and Sequences which covers string algorithms in more detail than anyone would probably need. It served me very well for a project on protein sequencing that I was working on a few years ago. After reading this book you will learn:

  • Naive String Matching
  • Preprocessor Based algorithms (Boyer Moore, Knuth-Morris-Pratt)
  • Regex matching algorithms
  • Karp-Rabin and similar methods
  • Suffix tree methods (Ukkonen's method, etc)
  • Sequence alignment (Levenshtein distance and string similarity, and multiple sequence alignment)
  • Applications to DNA sequencing, gene prediction and other areas.

This is not a book recommendation, but this library and site is a library that offers plenty of efficient string matching algorithm implementations:

http://www.dcs.shef.ac.uk/~sam/simmetrics.html

It also provides links to further learning for each and where each is best applicable.


Jewels of Stringology


CLR has some string processing algorithms, but it's not specific to them.

Including:


TRE is an open-source library that implements approximate matching. The About page has some interesting hints about how it works, although I'm not sure it provides the sort of in-depth analysis you're looking for. The source code is probably more enlightening from that perspective.


Need Your Help

V8: console.log implementation

javascript v8

I'm using V8 in my C++ app and I would like to add console.log(). Is there some good standard implementation I can use?

js, make a json object a string - the easy way?

javascript json string

I have tried lots of different ways, and using many diffent PHP and JS functions to try to achieve this.